Nucleic acid that encodes a fusion protein

ABSTRACT

This invention provides fusion polypeptides that include a glycosyltransferase catalytic domain and a catalytic domain from an accessory enzyme that is involved in making a substrate for a glycosyltransferase reaction. Nucleic acids that encode the fusion polypeptides are also provided, as are host cells for expressing the fusion polypeptides of the invention.

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application is a continuation-in-part of, and claims benefit of, U.S. Provisional Application No. 60/069,443, filed Dec. 15, 1997, which application is incorporated herein by reference for all purposes.

BACKGROUND OF THE INVENTION

[0002] 1. Field of the Invention

[0003] This invention pertains to the field of enzymatic synthesis of oligosaccharides using fusion proteins that can catalyze more than one reaction involved in the enzymatic synthesis.

[0004] 2. Background

[0005] Increased understanding of the role of carbohydrates as recognition elements on the surface of cells has led to increased interest in the production of carbohydrate molecules of defined structure. For instance, compounds comprising the sialyl Lewis ligands, sialyl Lewis^(x) and sialyl Lewis^(a), are present in leukocyte and non-leukocyte cell lines that bind to receptors such as the ELAM-1 and GMP 140 receptors. Polley et al., Proc. Natl. Acad. Sci. USA (1991) 88: 6224 and Phillips et al. (1990) Science 250: 1130; see also U.S. Pat. No. 5,753,631.

[0006] Because of interest in making desired carbohydrate structures, glycosyltransferases and their role in enzyme-catalyzed synthesis of carbohydrates are presently being extensively studied. These enzymes exhibit high specificity and are useful in forming carbohydrate structures of defined sequence. Consequently, glycosyltransferases are increasingly used as enzymatic catalysts in synthesis of a number of carbohydrates used for therapeutic and other purposes. In the application of enzymes to the field of synthetic carbohydrate chemistry, the use of glycosyltransferases for enzymatic synthesis of carbohydrate offers advantages over chemical methods due to the virtually complete stereoselectivity and linkage specificity offered by the enzymes (Ito et al. (1993) Pure Appl. Chem. 65: 753; and U.S. Pat. Nos. 5,352,670, and 5,374,541).

[0007] Chemoenzymatic syntheses of oligosaccharides and of corresponding derivatives therefore represent an interesting opportunity to develop novel therapeutic agents. However, this approach is still hampered by the relatively poor availability of the required glycosyltransferases and the difficulty and cost of obtaining substrates for these enzymes. Large-scale enzymatic syntheses of oligosaccharides will also require large amounts of the accessory enzymes necessary for the synthesis of the sugar-nucleotides that are used as the donors by the glycosyltransferases. The present invention provides fusion proteins that simplify the purification of enzymes that are useful for enzymatic synthesis of oligosaccharides.

SUMMARY OF THE INVENTION

[0008] The present invention provides fusion polypeptides that are useful for enzymatic synthesis of oligosaccharides. The fusion polypeptides of the invention have a catalytic domain of a glycosyltransferase joined to a catalytic domain of an accessory enzyme. The accessory enzyme catalytic domain can, for example, catalyze a step in the formation of a nucleotide sugar which is a donor for the glycosyltransferase, or catalyze a reaction involved in a glycosyltransferase cycle.

[0009] In another embodiment, the invention provides nucleic acids that include a polynucleotide that encodes a fusion polypeptide. The fusion polypeptides have a catalytic domain of a glycosyltransferase, and a catalytic domain of an accessory enzyme. Expression cassettes and expression vectors that include the nucleic acids are also provided, as are host cells that contain the nucleic acids of the invention.

[0010] The invention also provides methods of producing a fusion polypeptide that has a catalytic domain of a glycosyltransferase and a catalytic domain of an accessory enzyme. The methods involve introducing a nucleic acid that encodes the fusion polypeptide into a host cell to produce a transformed host cell; and culturing the transformed host cell under conditions appropriate for expressing the fusion polypeptide.

BRIEF DESCRIPTION OF THE DRAWINGS

[0011] FIG. 1 is a diagram of a recombinant sialyltransferase/CMP-NeuAc synthetase fusion protein of the invention. The C terminus of the CMP-Neu5Ac synthetase is linked covalently to the N terminus of the α-2,3-sialyltransferase through a 9-residue peptide linker. The first Met residue of the α-2,3-sialyltransferase was replaced by a Leu residue (underlined in the linker sequence). The C terminus of the fusion protein also includes a c-Myc epitope tag for immuno-detection and a His₆ tail for purification by IMAC. The total length of the fusion protein encoded by pFUS-01/2 is 625 residues.

[0012] FIG. 2 shows the nucleotide (SEQ ID NO: 1) and deduced amino acid (SEQ ID NO: 2) sequences of lgtB from Neisseria meningitidis.

[0013] FIG. 3 shows a diagram of a recombinant fusion protein that catalyzes transfer of galactose residues from a donor to an acceptor. The C terminus of the UDP-Glc/Gal epimerase is linked covalently to the N terminus of the β-1,4-Galactosyltransferase through a 4-residue peptide linker. The first Met residue of the β-1,4-Galactosyltransferase was replaced by a Val residue (underlined in the linker sequence). The total length of the fusion protein encoded by pFUS-EB is 611 residues.

[0014] FIG. 4 shows primers that were used in the construction of the UDP-Glc/Gal epimerase/β-1,4-Galactosyltransferase fusion protein.

DETAILED DESCRIPTION

[0015] Definitions

[0016] The fusion proteins of the invention are useful for transferring a monosaccharide from a donor substrate to an acceptor molecule, and/or for forming a reactant that is involved in the saccharide transfer reaction. The addition generally takes place at the non-reducing end of an oligosaccharide or carbohydrate moiety on a biomolecule. Biomolecules as defined here include but are not limited to biologically significant molecules such as carbohydrates, proteins (e.g., glycoproteins), and lipids (e.g., glycolipids, phospholipids, sphingolipids and gangliosides).

[0017] The following abbreviations are used herein:

[0018] Ara=arabinosyl;

[0019] Fru=fructosyl;

[0020] Fuc=fucosyl;

[0021] Gal=galactosyl;

[0022] GalNAc=N-acetylgalactosylamino;

[0023] Glc=glucosyl;

[0024] GlcNAc=N-acetylglucosylamino;

[0025] Man=mannosyl; and

[0026] NeuAc=sialyl (N-acetylneuraminyl).

[0027] Oligosaccharides are considered to have a reducing end and a non-reducing end, whether or not the saccharide at the reducing end is in fact a reducing sugar. In accordance with accepted nomenclature, oligosaccharides are depicted herein with the non-reducing end on the left and the reducing end on the right.

[0028] All oligosaccharides described herein are described with the name or abbreviation for the non-reducing saccharide (e.g., Gal), followed by the configuration of the glycosidic bond (α or β), the ring bond, the ring position of the reducing saccharide involved in the bond, and then the name or abbreviation of the reducing saccharide (e.g., GlcNAc). The linkage between two sugars may be expressed, for example, as 2,3, 2→3, or (2,3). Each saccharide is a pyranose or furanose.

[0029] Donor substrates for glycosyltransferases are activated nucleotide sugars. Such activated sugars generally consist of uridine, guanosine, and cytidine monophosphate or diphosphate derivatives of the sugars in which the nucleoside monophosphate or diphosphate serves as a leaving group. The donor substrates for sialyltransferases, for example, are activated sugar nucleotides comprising the desired sialic acid. For instance, in the case of NeuAc, the activated sugar is CMP-NeuAc.

[0030] The term “sialic acid” refers to 5-N-acetylneuraminic acid (NeuAc) or 5-N-glycolylneuraminic acid (NeuGc). Other sialic acids may be used in their place, however. For a review of different forms of sialic acid suitable in the present invention see Schauer, Methods in Enzymology, 50: 64-89 (1987), and Schauer, Advances in Carbohydrate Chemistry and Biochemistry, 40: 131-234.

[0031] A “fusion glycosyltransferase polypeptide” of the invention is a glycosyltransferase fusion polypeptide that contains a glycosyltransferase catalytic domain and a second catalytic domain from an accessory enzyme (e.g., a CMP-Neu5Ac synthetase or a UDP-Glucose 4′ epimerase (galE)) and is capable of catalyzing the transfer of an oligosaccharide residue from a donor substrate (e.g., CMP-NeuAc or UDP-Gal) to an acceptor molecule. Typically, such polypeptides will be substantially similar to the exemplified proteins disclosed here.

[0032] An “accessory enzyme,” as referred to herein, is an enzyme that is involved in catalyzing a reaction that, for example, forms a substrate for a glycosyltransferase. An accessory enzyme can, for example, catalyze the formation of a nucleotide sugar that is used as a donor moiety by a glycosyltransferase. An accessory enzyme can also be one that is used in the generation of a nucleotide triphosphate required for formation of a nucleotide sugar, or in the generation of the sugar which is incorporated into the nucleotide sugar.

[0033] A “catalytic domain” refers to a portion of an enzyme that is sufficient to catalyze an enzymatic reaction that is normally carried out by the enzyme. For example, a catalytic domain of a sialyltransferase will include a sufficient portion of the sialyltransferase to transfer a sialic acid residue from a donor to an acceptor saccharide. A catalytic domain can include an entire enzyme, a subsequence thereof, or can include additional amino acid sequences that are not attached to the enzyme or subsequence as found in nature.

[0034] Much of the nomenclature and general laboratory procedures required in this application can be found in Sambrook, et al., Molecular Cloning: A Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y., 1989. The manual is hereinafter referred to as “Sambrook et al.”

[0035] The term “nucleic acid” refers to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, encompasses known analogues of natural nucleotides that hybridize to nucleic acids in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence includes the complementary sequence thereof.

[0036] The term “operably linked” refers to functional linkage between a nucleic acid expression control sequence (such as a promoter, signal sequence, or array of transcription factor binding sites) and a second nucleic acid sequence, wherein the expression control sequence affects transcription and/or translation of the nucleic acid corresponding to the second sequence.

[0037] A “heterologous sequence” or a “heterologous nucleic acid,” as used herein, is one that originates from a source foreign to the particular host cell, or, if from the same source, is modified from its original form. Thus, a heterologous glycosyltransferase gene in a particular host cell includes a glycosyltransferase gene that is endogenous to the particular host cell but has been modified. Modification of the heterologous nucleic acid can occur, e.g., by treating the DNA with a restriction enzyme to generate a DNA fragment that is capable of being operably linked to the promoter. Techniques such as site-directed mutagenesis are also useful for modifying a heterologous nucleic acid.

[0038] A “subsequence” refers to a sequence of nucleic acids or amino acids that comprises a part of a longer sequence of nucleic acids or amino acids (e.g., polypeptide), respectively.

[0039] The term “recombinant” when used with reference to a cell indicates that the cell replicates a heterologous nucleic acid, or expresses a peptide or protein encoded by a heterologous nucleic acid. Recombinant cells can contain genes that are not found within the native (non-recombinant) form of the cell. Recombinant cells can also contain genes found in the native form of the cell wherein the genes are modified and re-introduced into the cell by artificial means. The term also encompasses cells that contain a nucleic acid endogenous to the cell that has been modified without removing the nucleic acid from the cell; such modifications include those obtained by gene replacement, site-specific mutation, and related techniques.

[0040] A “recombinant expression cassette” or simply an “expression cassette” is a nucleic acid construct, generated recombinantly or synthetically, with nucleic acid elements that are capable of affecting expression of a structural gene in hosts compatible with such sequences. Expression cassettes include at least promoters and, optionally, transcription termination signals. Typically, the recombinant expression cassette includes a nucleic acid to be transcribed (e.g., a nucleic acid encoding a desired polypeptide), and a promoter. Additional factors necessary or helpful in effecting expression may also be used as described herein. For example, an expression cassette can also include nucleotide sequences that encode a signal sequence that directs secretion of an expressed protein from the host cell. Transcription termination signals, enhancers, and other nucleic acid sequences that influence gene expression can also be included in an expression cassette.

[0041] The term “isolated” is meant to refer to material which is substantially or essentially free from components which normally accompany the material as found in its native state. Thus, an isolated material does not include materials normally associated with their in situ environment. Typically, isolated proteins of the invention are at least about 80% pure, usually at least about 90%, and preferably at least about 95% pure as measured by band intensity on a silver stained gel or other method for determining purity. Protein purity or homogeneity can be indicated by a number of means well known in the art, such as polyacrylamide gel electrophoresis of a protein sample, followed by visualization upon staining. For certain purposes high resolution will be needed and HPLC or a similar means for purification utilized.

[0042] The terms “identical” or percent “identity,” in the context of two or more nucleic acids or polypeptide sequences, refer to two or more sequences or subsequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection.

[0043] The phrase “substantially identical,” in the context of two nucleic acids or polypeptides, refers to two or more sequences or subsequences that have at least 60%, preferably 80%, most preferably 90-95% nucleotide or amino acid residue identity, when compared and aligned for maximum correspondence, as measured using one of the following sequence comparison algorithms or by visual inspection. Preferably, the substantial identity exists over a region of the sequences that is at least about 50 residues in length, more preferably over a region of at least about 100 residues, and most preferably the sequences are substantially identical over at least about 150 residues. In a most preferred embodiment, the sequences are substantially identical over the entire length of the coding regions.
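By way of illustration only, the percent identity and "substantially identical" criteria described above can be sketched as follows. The sketch assumes the two sequences have already been aligned (gaps shown as "-") and counts identity over alignment columns; the function names, the gap symbol, and the choice of denominator are assumptions of this example and are not part of the disclosure.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumption: columns where both sequences carry a gap are ignored, and a
    column with a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def is_substantially_identical(aligned_a: str, aligned_b: str,
                               threshold: float = 60.0,
                               min_region: int = 50) -> bool:
    """Apply the 60% identity floor over a region of at least 50 residues."""
    compared = sum(1 for a, b in zip(aligned_a, aligned_b)
                   if a != "-" and b != "-")
    return (compared >= min_region
            and percent_identity(aligned_a, aligned_b) >= threshold)
```

The defaults correspond to the broadest thresholds recited above (60% identity, 50 residues); the preferred values (80%, 90-95%, 100 or 150 residues) can be passed instead.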

[0044] For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.

[0045] Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Nat'l. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by visual inspection (see generally, Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (1995 Supplement) (Ausubel)).

[0046] Examples of algorithms that are suitable for determining percent sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms, which are described in Altschul et al. (1990) J. Mol. Biol. 215: 403-410 and Altschul et al. (1997) Nucleic Acids Res. 25: 3389-3402, respectively. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always >0) and N (penalty score for mismatching residues; always <0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an expectation (E) of 10, M=5, N=−4, and a comparison of both strands. For amino acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)).
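The seed-and-extend procedure described above can be illustrated, for nucleotide sequences, by the following simplified sketch. It is not the BLAST program itself: it seeds only on exact word matches (the neighborhood threshold T is ignored), performs ungapped extension only, and computes no expectation value. The function names and the default value of X are assumptions of this example; W, M, and N default to the BLASTN values stated above.

```python
def find_seeds(query: str, subject: str, w: int = 11):
    """Return (query_pos, subject_pos) pairs where a word of length w matches exactly."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    return [(i, j)
            for i in range(len(query) - w + 1)
            for j in index.get(query[i:i + w], [])]


def extend(query, subject, qi, sj, step, start_score, m, n, x):
    """Ungapped extension in one direction (step = +1 or -1).

    Stops when the cumulative score drops by x from its maximum, when it
    reaches zero or below, or when either sequence ends.
    Returns (best cumulative score, number of residues kept).
    """
    score = best = start_score
    length = best_len = 0
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        score += m if query[qi] == subject[sj] else n
        length += 1
        if score > best:
            best, best_len = score, length
        if best - score >= x or score <= 0:
            break
        qi += step
        sj += step
    return best, best_len


def hsp_from_seed(query, subject, i, j, w=11, m=5, n=-4, x=20):
    """Grow a seed at (i, j) into a high scoring pair.

    Returns (score, query_start, query_end_exclusive, subject_start).
    """
    seed_score = w * m  # an exact word match scores M per residue
    right_best, right_len = extend(query, subject, i + w, j + w, +1,
                                   seed_score, m, n, x)
    left_best, left_len = extend(query, subject, i - 1, j - 1, -1,
                                 right_best, m, n, x)
    return left_best, i - left_len, i + w + right_len, j - left_len
```

For example, with the hypothetical sequences `GGACGTACGTCC` and `TTACGTACGTAA` and a word length of 4, the seed at position (2, 2) extends to the shared segment `ACGTACGT`.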

[0047] In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Nat'l. Acad. Sci. USA 90:5873-5877 (1993)). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.

[0048] A further indication that two nucleic acid sequences or polypeptides are substantially identical is that the polypeptide encoded by the first nucleic acid is immunologically cross reactive with the polypeptide encoded by the second nucleic acid, as described below. Thus, a polypeptide is typically substantially identical to a second polypeptide, for example, where the two peptides differ only by conservative substitutions. Another indication that two nucleic acid sequences are substantially identical is that the two molecules hybridize to each other under stringent conditions, as described below.

[0049] The phrase “hybridizing specifically to” refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular) DNA or RNA.

[0050] The term “stringent conditions” refers to conditions under which a probe will hybridize to its target subsequence, but to no other sequences. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. Generally, stringent conditions are selected to be about 15° C. lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength, pH, and nucleic acid concentration) at which 50% of the probes complementary to the target sequence hybridize to the target sequence at equilibrium. (As the target sequences are generally present in excess, at Tm, 50% of the probes are occupied at equilibrium). Typically, stringent conditions will be those in which the salt concentration is less than about 1.0 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30° C. for short probes (e.g., 10 to 50 nucleotides) and at least about 60° C. for long probes (e.g., greater than 50 nucleotides). Stringent conditions may also be achieved with the addition of destabilizing agents such as formamide.
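The numerical ranges given above for typical stringent conditions can be expressed as a simple check, shown here only as a sketch. The function and parameter names are assumptions of this example; the Tm value must be supplied by the user, since no formula for Tm is given here, and the effect of destabilizing agents such as formamide is not modeled.

```python
def stringent_temperature(tm_celsius: float, offset: float = 15.0) -> float:
    """Temperature selected 'about 15 degrees C lower than Tm', per the text above."""
    return tm_celsius - offset


def meets_typical_stringency(na_molar: float, ph: float,
                             temp_celsius: float, probe_length: int) -> bool:
    """Check salt, pH and temperature against the typical ranges stated above.

    Short probes (up to 50 nucleotides) need at least about 30 degrees C;
    longer probes need at least about 60 degrees C.
    """
    min_temp = 30.0 if probe_length <= 50 else 60.0
    return (na_molar < 1.0
            and 7.0 <= ph <= 8.3
            and temp_celsius >= min_temp)
```

For a hypothetical probe with a Tm of 72° C., `stringent_temperature(72.0)` gives 57.0, which satisfies the temperature floor for a short probe but not for a probe longer than 50 nucleotides.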

[0051] The phrases “specifically binds to a protein” or “specifically immunoreactive with,” when referring to an antibody, refer to a binding reaction which is determinative of the presence of the protein in the presence of a heterogeneous population of proteins and other biologics. Thus, under designated immunoassay conditions, the specified antibodies bind preferentially to a particular protein and do not bind in a significant amount to other proteins present in the sample. Specific binding to a protein under such conditions requires an antibody that is selected for its specificity for a particular protein. A variety of immunoassay formats may be used to select antibodies specifically immunoreactive with a particular protein. For example, solid-phase ELISA immunoassays are routinely used to select monoclonal antibodies specifically immunoreactive with a protein. See Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Publications, New York, for a description of immunoassay formats and conditions that can be used to determine specific immunoreactivity.

[0052] “Conservatively modified variations” of a particular polynucleotide sequence refers to those polynucleotides that encode identical or essentially identical amino acid sequences, or where the polynucleotide does not encode an amino acid sequence, to essentially identical sequences. Because of the degeneracy of the genetic code, a large number of functionally identical nucleic acids encode any given polypeptide. For instance, the codons CGU, CGC, CGA, CGG, AGA, and AGG all encode the amino acid arginine. Thus, at every position where an arginine is specified by a codon, the codon can be altered to any of the corresponding codons described without altering the encoded polypeptide. Such nucleic acid variations are “silent variations,” which are one species of “conservatively modified variations.” Every polynucleotide sequence described herein which encodes a polypeptide also describes every possible silent variation, except where otherwise noted. One of skill will recognize that each codon in a nucleic acid (except AUG, which is ordinarily the only codon for methionine, and UGG, which is ordinarily the only codon for tryptophan) can be modified to yield a functionally identical molecule by standard techniques. Accordingly, each “silent variation” of a nucleic acid which encodes a polypeptide is implicit in each described sequence.
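The degeneracy described above can be made concrete with a short sketch that builds the standard genetic code, lists the synonymous codons for an amino acid, and counts how many distinct coding sequences encode a given peptide. The standard code is assumed (organism-specific variant codes are not handled), and the function names are hypothetical.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered with U, C, A, G at each position;
# '*' marks a stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_codons(amino_acid: str) -> list:
    """All codons that encode the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def count_encodings(peptide: str) -> int:
    """Number of distinct coding sequences (silent variations included)."""
    return prod(len(synonymous_codons(aa)) for aa in peptide)
```

Here `synonymous_codons("R")` returns the six arginine codons named above, while methionine and tryptophan each return a single codon, so the tripeptide Met-Arg-Trp can be encoded by six different nucleic acid sequences.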

[0053] Furthermore, one of skill will recognize that individual substitutions, deletions or additions which alter, add or delete a single amino acid or a small percentage of amino acids (typically less than 5%, more typically less than 1%) in an encoded sequence are “conservatively modified variations” where the alterations result in the substitution of an amino acid with a chemically similar amino acid. Conservative substitution tables providing functionally similar amino acids are well known in the art.
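As a sketch of how the criterion above might be applied to substitutions, the following example compares a variant with a reference sequence of the same length. The grouping of chemically similar amino acids shown is one commonly used substitution table and is an assumption of this example, as are the function names; insertions and deletions are not handled.

```python
# One commonly used grouping of chemically similar amino acids (an assumption
# of this example; other conservative substitution tables exist).
CONSERVATIVE_GROUPS = [set("AST"), set("DE"), set("NQ"),
                       set("RK"), set("ILMV"), set("FYW")]


def is_conservative(a: str, b: str) -> bool:
    """True if residues a and b are identical or fall in the same group."""
    return a == b or any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def classify_variant(reference: str, variant: str, max_fraction: float = 0.05):
    """Return (fraction of positions changed, is conservatively modified).

    A variant qualifies when fewer than max_fraction of its positions differ
    and every difference is a substitution within one group.
    """
    if not reference or len(reference) != len(variant):
        raise ValueError("sequences must be non-empty and of equal length")
    changes = [(r, v) for r, v in zip(reference, variant) if r != v]
    fraction = len(changes) / len(reference)
    conservative = all(is_conservative(r, v) for r, v in changes)
    return fraction, conservative and fraction < max_fraction
```

The default of 5% follows the "typically less than 5%" figure above; passing `max_fraction=0.01` applies the stricter "more typically less than 1%" figure.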

[0054] One of skill will appreciate that many conservative variations of the fusion proteins and nucleic acid which encode the fusion proteins yield essentially identical products. For example, due to the degeneracy of the genetic code, “silent substitutions” (i.e., substitutions of a nucleic acid sequence which do not result in an alteration in an encoded polypeptide) are an implied feature of every nucleic acid sequence which encodes an amino acid. As described herein, sequences are preferably optimized for expression in a particular host cell used to produce the fusion proteins (e.g., yeast, human, and the like). Similarly, “conservative amino acid substitutions,” in which one or a few amino acids in an amino acid sequence are substituted with different amino acids with highly similar properties (see the definitions section, supra), are also readily identified as being highly similar to a particular amino acid sequence, or to a particular nucleic acid sequence which encodes an amino acid. Such conservatively substituted variations of any particular sequence are a feature of the present invention. See also, Creighton (1984) Proteins, W. H. Freeman and Company. In addition, individual substitutions, deletions or additions which alter, add or delete a single amino acid or a small percentage of amino acids in an encoded sequence are also “conservatively modified variations”.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0055] The present invention provides fusion polypeptides that include a glycosyltransferase catalytic domain and at least one catalytic domain of one or more accessory enzymes. Accessory enzymes can, for example, catalyze a step in the formation of a nucleotide sugar which is a donor for the glycosyltransferase. Nucleic acids that encode the fusion polypeptides are also provided, as are expression vectors and host cells that include these nucleic acids.

[0056] The fusion polypeptides of the invention find use in the enzymatic synthesis of oligosaccharides. Significant advantages are provided by the fusion polypeptides. For example, the use of a fusion polypeptide that has two or more enzymatic activities reduces the number of polypeptides that must be obtained for a given synthesis. Thus, purification is simplified.

[0057] A. Glycosyltransferases

[0058] The fusion polypeptides of the invention include a catalytic domain of a glycosyltransferase. The catalytic domain can be from any of a wide variety of glycosyltransferases. Among the glycosyltransferases from which one can obtain a catalytic domain are the sialyltransferases, N-acetylglucosaminyltransferases, N-acetylgalactosaminyltransferases, fucosyltransferases, galactosyltransferases, glucosyltransferases, xylosyltransferases, and mannosyltransferases.

[0059] The glycosyltransferases can be either prokaryotic or eukaryotic glycosyltransferases.

[0060] Eukaryotic Glycosyltransferases

[0061] The fusion polypeptides of the present invention can include a catalytic domain of a eukaryotic glycosyltransferase. Eukaryotic glycosyltransferases typically have topological domains at their amino terminus that are not required for catalytic activity (see, U.S. Pat. No. 5,032,519). The “cytoplasmic domain,” which is most commonly between about 1 and about 10 amino acids in length, is the most amino-terminal domain. The adjacent domain, termed the “signal-anchor domain,” is generally between about 10 and about 26 amino acids in length. Adjacent to the signal-anchor domain is a “stem region,” which is typically between about 20 and about 60 amino acids in length. The stem region functions as a retention signal to maintain the glycosyltransferase in the Golgi apparatus. The catalytic domain of the glycosyltransferase is found to the carboxyl side of the stem region.

[0062] In a presently preferred embodiment, the glycosyltransferase catalytic domains that are present in the fusion proteins of the invention substantially lack one or more of the cytoplasmic, signal-anchor, and stem region domains. More preferably, two of these domains are at least substantially absent from the fusion protein, and most preferably all three of the cytoplasmic domain, the signal-anchor domain, and the stem region are substantially or completely absent from the fusion proteins of the invention.
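The domain order described above (cytoplasmic domain, signal-anchor domain, stem region, then catalytic domain) means that a catalytic domain lacking all three amino-terminal domains is simply the carboxyl-terminal remainder of the sequence. The following sketch illustrates this; the function name is hypothetical, and the domain lengths must be supplied for the particular enzyme, since actual domain boundaries are enzyme-specific.

```python
def catalytic_domain(sequence: str, cytoplasmic_len: int,
                     signal_anchor_len: int, stem_len: int) -> str:
    """Return the portion of the sequence on the carboxyl side of the stem region.

    The three lengths are inputs chosen by the user for a given
    glycosyltransferase; they are not predicted by this function.
    """
    start = cytoplasmic_len + signal_anchor_len + stem_len
    if min(cytoplasmic_len, signal_anchor_len, stem_len) < 0:
        raise ValueError("domain lengths must be non-negative")
    if start >= len(sequence):
        raise ValueError("domain lengths leave no catalytic domain")
    return sequence[start:]
```

For a placeholder sequence with a 5-residue cytoplasmic domain, a 20-residue signal-anchor domain, and a 40-residue stem region, the function returns everything from residue 66 onward.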

[0063] Many mammalian glycosyltransferases have been cloned and expressed, and the recombinant proteins have been characterized in terms of donor and acceptor specificity. They have also been investigated through site-directed mutagenesis in attempts to define residues involved in either donor or acceptor specificity (Aoki et al. (1990) EMBO J. 9: 3171-3178; Harduin-Lepers et al. (1995) Glycobiology 5(8): 741-758; Natsuka and Lowe (1994) Current Opinion in Structural Biology 4: 683-691; Zu et al. (1995) Biochem. Biophys. Res. Comm. 206(1): 362-369; Seto et al. (1995) Eur. J. Biochem. 234: 323-328; Seto et al. (1997) J. Biol. Chem. 272: 14133-14138).

[0064] In some embodiments, the glycosyltransferase catalytic domain is obtained from a fucosyltransferase. A number of fucosyltransferases are known to those of skill in the art. Briefly, fucosyltransferases include any of those enzymes which transfer L-fucose from GDP-fucose to a hydroxy position of an acceptor sugar. In some embodiments, for example, the acceptor sugar is a GlcNAc in a Galβ(1→4)GlcNAc group in an oligosaccharide glycoside. Suitable fucosyltransferases for this reaction include the known Galβ(1→3,4)GlcNAc α(1→3,4)fucosyltransferase (FTIII, E.C. No. 2.4.1.65) which is obtained from human milk (see, Palcic, et al., Carbohydrate Res. 190:1-11 (1989); Prieels, et al., J. Biol. Chem. 256: 10456-10463 (1981); and Nunez, et al., Can. J. Chem. 59: 2086-2095 (1981)) and the Galβ(1→4)GlcNAc α(1→3)fucosyltransferases (FTIV, FTV, FTVI, and FTVII, E.C. No. 2.4.1.65) which are found in human serum. A recombinant form of Galβ(1→3,4)GlcNAc α(1→3,4)fucosyltransferase is also available (see, Dumas, et al., Bioorg. Med. Letters 1:425-428 (1991) and Kukowska-Latallo, et al., Genes and Development 4:1288-1303 (1990)). Other exemplary fucosyltransferases include α1,2 fucosyltransferase (E.C. No. 2.4.1.69). Enzymatic fucosylation can be carried out by the methods described in Mollicone, et al., Eur. J. Biochem. 191:169-176 (1990) or U.S. Pat. No. 5,374,655.

[0065] In another group of embodiments, the glycosyltransferase catalytic domain is obtained from a galactosyltransferase. Exemplary galactosyltransferases include α1,3-galactosyltransferases (E.C. No. 2.4.1.151, see, e.g., Dabkowski et al., Transplant Proc. 25:2921 (1993) and Yamamoto et al. Nature 345:229-233 (1990), bovine (GenBank j04989, Joziasse et al. (1989) J. Biol. Chem. 264:14290-14297), murine (GenBank m26925; Larsen et al. (1989) Proc. Nat'l. Acad. Sci. USA 86:8227-8231), porcine (GenBank L36152; Strahan et al. (1995) Immunogenetics 41:101-105)). Another suitable α1,3-galactosyltransferase is that which is involved in synthesis of the blood group B antigen (EC 2.4.1.37, Yamamoto et al. (1990) J. Biol. Chem. 265:1146-1151 (human)). Also suitable for use in the fusion polypeptides of the invention are β1,4-galactosyltransferases, which include, for example, EC 2.4.1.90 (LacNAc synthetase) and EC 2.4.1.22 (lactose synthetase) (bovine (D'Agostaro et al. (1989) Eur. J. Biochem. 183:211-217), human (Masri et al. (1988) Biochem. Biophys. Res. Commun. 157:657-663), murine (Nakazawa et al. (1988) J. Biochem. 104:165-168)), as well as E.C. 2.4.1.38 and the ceramide galactosyltransferase (EC 2.4.1.45, Stahl et al. (1994) J. Neurosci. Res. 38:234-242). Other suitable galactosyltransferases include, for example, α1,2-galactosyltransferases (from, e.g., Schizosaccharomyces pombe, Chapell et al. (1994) Mol. Biol. Cell 5:519-528).

[0066] Sialyltransferases are another type of glycosyltransferase that is useful in the recombinant cells and reaction mixtures of the invention. Examples of sialyltransferases that are suitable for use in the present invention include ST3Gal III (preferably a rat ST3Gal III), ST3Gal IV, ST3Gal I, ST6Gal I, ST3Gal V, ST6Gal II, ST6GalNAc I, ST6GalNAc II, and ST6GalNAc III (the sialyltransferase nomenclature used herein is as described in Tsuji et al. (1996) Glycobiology 6: v-xiv). An exemplary α2,3-sialyltransferase (EC 2.4.99.6) transfers sialic acid to the non-reducing terminal Gal of a Galβ1→4GlcNAc disaccharide or glycoside. See, Van den Eijnden et al., J. Biol. Chem., 256:3159 (1981), Weinstein et al., J. Biol. Chem., 257:13845 (1982) and Wen et al., J. Biol. Chem., 267:21011 (1992). Another exemplary α2,3-sialyltransferase (EC 2.4.99.4) transfers sialic acid to the non-reducing terminal Gal of a Galβ1→3GalNAc disaccharide or glycoside. See, Rearick et al., J. Biol. Chem., 254: 4444 (1979) and Gillespie et al., J. Biol. Chem., 267:21004 (1992). Further exemplary enzymes include Gal-β-1,4-GlcNAc α-2,6 sialyltransferase (See, Kurosawa et al. Eur. J. Biochem. 219: 375-381 (1994)). Sialyltransferase nomenclature is described in Tsuji, S. et al. (1996) Glycobiology 6:v-vii.

[0067] Other glycosyltransferases that can be used in the fusion polypeptides of the invention have been described in detail, as for the sialyltransferases, galactosyltransferases, and fucosyltransferases. In particular, the glycosyltransferase can also be, for instance, a glucosyltransferase, e.g., Alg8 (Stagljar et al., Proc. Natl. Acad. Sci. USA 91:5977 (1994)) or Alg5 (Heesen et al. Eur. J. Biochem. 224:71 (1994)), or an N-acetylgalactosaminyltransferase such as, for example, β(1,3)-N-acetylgalactosaminyltransferase, β(1,4)-N-acetylgalactosaminyltransferases (U.S. Pat. No. 5,691,180, Nagata et al. J. Biol. Chem. 267:12082-12089 (1992), and Smith et al. J. Biol Chem. 269:15162 (1994)) and polypeptide N-acetylgalactosaminyltransferase (Homa et al. J. Biol Chem. 268:12609 (1993)). Suitable N-acetylglucosaminyltransferases include GnTI (2.4.1.101, Hull et al., BBRC 176:608 (1991)), GnTII, and GnTIII (Ihara et al. J. Biochem. 113:692 (1993)), GnTV (Shoreibah et al. J. Biol. Chem. 268: 15381 (1993)), O-linked N-acetylgalactosaminyltransferase (Bierhuizen et al. Proc. Natl. Acad. Sci. USA 89:9326 (1992)), N-acetylglucosamine-1-phosphate transferase (Rajput et al. Biochem J. 285:985 (1992)), and hyaluronan synthase. Also of interest are enzymes involved in proteoglycan synthesis, such as, for example, N-acetylgalactosaminyltransferase I (EC 2.4.1.174), and enzymes involved in chondroitin sulfate synthesis, such as N-acetylgalactosaminyltransferase II (EC 2.4.1.175). Suitable mannosyltransferases include α(1,2) mannosyltransferase, α(1,3) mannosyltransferase, β(1,4) mannosyltransferase, Dol-P-Man synthase, Och1, and Pmt1. Xylosyltransferases include, for example, protein xylosyltransferase (EC 2.4.2.26).

[0068] Prokaryotic Glycosyltransferases

[0069] In other embodiments, the fusion proteins of the invention include a glycosyltransferase catalytic domain from a prokaryotic glycosyltransferase. Nucleic acids encoding several prokaryotic glycosyltransferases have been cloned and characterized, and can be used in the fusion proteins of the invention. As is the case for eukaryotic glycosyltransferases, prokaryotic glycosyltransferases often have a membrane-spanning domain near the amino terminus that can be omitted, if desired, from the fusion polypeptide.

[0070] Suitable prokaryotic glycosyltransferases include enzymes involved in synthesis of lipooligosaccharides (LOS), which are produced by many Gram negative bacteria. The LOS typically have terminal glycan sequences that mimic glycoconjugates found on the surface of human epithelial cells or in host secretions (Preston et al. (1996) Critical Reviews in Microbiology 23(3): 139-180). Such enzymes include, but are not limited to, the proteins of the rfa operons of species such as E. coli and Salmonella typhimurium, which include an α1,6-galactosyltransferase and an α1,3-galactosyltransferase (see, e.g., EMBL Accession Nos. M80599 and M86935 (E. coli); EMBL Accession No. S56361 (S. typhimurium)), a glucosyltransferase (Swiss-Prot Accession No. P25740 (E. coli)), an α1,2-glucosyltransferase (rfaJ) (Swiss-Prot Accession No. P27129 (E. coli) and Swiss-Prot Accession No. P19817 (S. typhimurium)), and an α1,2-N-acetylglucosaminyltransferase (rfaK) (EMBL Accession No. U00039 (E. coli)). Other glycosyltransferases for which amino acid and/or nucleic acid sequences are known include those that are encoded by operons such as rfaB, which have been characterized in organisms such as Klebsiella pneumoniae, E. coli, Salmonella typhimurium, Salmonella enterica, Yersinia enterocolitica, Mycobacterium leprosum, and the rhl operon of Pseudomonas aeruginosa.

[0071] Also suitable for use in the fusion proteins of the invention are glycosyltransferases that are involved in producing structures containing lacto-N-neotetraose, D-galactosyl-β-1,4-N-acetyl-D-glucosaminyl-β-1,3-D-galactosyl-β-1,4-D-glucose, and the P^(k) blood group trisaccharide sequence, D-galactosyl-α-1,4-D-galactosyl-β-1,4-D-glucose, which have been identified in the LOS of the mucosal pathogens Neisseria gonorrhoeae and N. meningitidis (Scholten et al. (1994) J. Med. Microbiol. 41: 236-243). The genes from N. meningitidis and N. gonorrhoeae that encode the glycosyltransferases involved in the biosynthesis of these structures have been identified from N. meningitidis immunotypes L3 and L1 (Jennings et al. (1995) Mol. Microbiol. 18: 729-740) and the N. gonorrhoeae mutant F62 (Gotschlich (1994) J. Exp. Med. 180: 2181-2190). In N. meningitidis, a locus consisting of three genes, lgtA, lgtB and lgtE, encodes the glycosyltransferase enzymes required for addition of the last three of the sugars in the lacto-N-neotetraose chain (Wakarchuk et al. (1996) J. Biol. Chem. 271: 19166-73). Recently the enzymatic activity of the lgtB and lgtA gene products was demonstrated, providing the first direct evidence for their proposed glycosyltransferase function (Wakarchuk et al. (1996) J. Biol. Chem. 271 (45): 28271-276). In N. gonorrhoeae, there are two additional genes, lgtD, which adds β-D-GalNAc to the 3 position of the terminal galactose of the lacto-N-neotetraose structure, and lgtC, which adds a terminal α-D-Gal to the lactose element of a truncated LOS, thus creating the P^(k) blood group antigen structure (Gotschlich (1994), supra.). In N. meningitidis, a separate immunotype L1 also expresses the P^(k) blood group antigen and has been shown to carry an lgtC gene (Jennings et al. (1995), supra.). Neisseria glycosyltransferases and associated genes are also described in U.S. Pat. No. 5,545,553 (Gotschlich). An α1,3-fucosyltransferase gene from Helicobacter pylori has also been characterized (Martin et al. (1997) J. Biol. Chem. 272: 21349-21356).

[0072] Sialyltransferases from prokaryotes have been described by, for example, Weisgerber et al. (1991) Glycobiol. 1:357-365; Frosch, M. et al. (1991) Mol. Microbiol. 5:1251-1263; and Gilbert, M. et al. (1996) J. Biol. Chem. 271:28271-28276. It has been suggested that the bacterial sialyltransferases might have a wider spectrum of acceptors than their mammalian counterparts (Kajihara, Y. et al. (1996) J. Org. Chem. 61:8632-8635 and Gilbert et al., Eur. J. Biochem. 249: 187-194 (1997)).

[0073] As is the case for eukaryotic glycosyltransferases, one can readily obtain nucleic acids that encode other prokaryotic glycosyltransferases that can be used in constructing fusion polypeptides according to the invention.

[0074] B. Accessory Enzymes Involved in Nucleotide Sugar Formation

[0075] The fusion polypeptides of the invention include, in addition to the glycosyltransferase catalytic domain(s), at least one catalytic domain from an accessory enzyme. Accessory enzymes include, for example, those enzymes that are involved in the formation of a nucleotide sugar. The accessory enzyme can be involved in attaching the sugar to a nucleotide, or can be involved in making the sugar or the nucleotide, for example. The nucleotide sugar is generally one that is utilized as a saccharide donor by the glycosyltransferase catalytic domain of the particular fusion polypeptide. Nucleotide sugars that are used as sugar donors by glycosyltransferases include, for example, GDP-Man, UDP-Glc, UDP-Gal, UDP-GlcNAc, UDP-GalNAc, CMP-sialic acid, UDP-xylose, GDP-Fuc, and GDP-GlcNAc.

[0076] Accessory enzymes that are involved in synthesis of nucleotide sugars are well known to those of skill in the art. For a review of bacterial polysaccharide synthesis and gene nomenclature, see, e.g., Reeves et al., Trends Microbiol. 4: 495-503 (1996). The methods described above for obtaining glycosyltransferase-encoding nucleic acids are also applicable to obtaining nucleic acids that encode enzymes involved in the formation of nucleotide sugars. For example, one can use one of the nucleic acids known in the art, some of which are listed below, directly or as a probe to isolate a corresponding nucleic acid from other organisms of interest.

[0077] As one example, to produce a galactosylated soluble oligosaccharide, a galactosyltransferase is often used. However, galactosyltransferases generally use as a galactose donor the activated nucleotide sugar UDP-Gal, which is comparatively expensive. To reduce the expense of the reaction, one can construct one or more fusion polypeptides that have the galactosyltransferase catalytic domain and also a catalytic domain from one of the accessory enzymes that are involved in the biosynthetic pathway which leads to UDP-Gal. For example, glucokinase (EC 2.7.1.2) catalyzes the phosphorylation of glucose to form Glc-6-P. Genes that encode glucokinase have been characterized (e.g., E. coli: GenBank AE000497 U00096, Blattner et al., Science 277: 1453-1474 (1997); Bacillus subtilis: GenBank Z99124, AL009126, Kunst et al., Nature 390, 249-256 (1997)), and thus can be readily obtained from many organisms by, for example, hybridization or amplification. A fusion polypeptide that contains a catalytic domain from this enzyme, as well as those of the subsequent enzymes in the pathway as set forth below, will thus be able to form UDP-glucose from readily available glucose, which can be either produced by the organism or added to the reaction mixture.

[0078] The next step in the pathway leading to UDP-Gal is catalyzed by phosphoglucomutase (EC 5.4.2.2), which converts Glc-6-P to Glc-1-P. Again, genes encoding this enzyme have been characterized for a wide range of organisms (e.g., Agrobacterium tumefaciens: GenBank AF033856, Uttaro et al. Gene 150: 117-122 (1994) [published erratum appears in Gene (1995) 155:141-3]; Entamoeba histolytica: GenBank Y14444, Ortner et al., Mol. Biochem. Parasitol. 90, 121-129 (1997); Mesembryanthemum crystallinum: GenBank U84888; S. cerevisiae: GenBank X72016, U09499, X74823, Boles et al., Eur. J. Biochem. 220: 83-96 (1994), Fu et al., J. Bacteriol. 177 (11), 3087-3094 (1995); human: GenBank M83088 (PGM1), Whitehouse et al., Proc. Nat'l. Acad. Sci. U.S.A. 89: 411-415 (1992); Xanthomonas campestris: GenBank M83231, Koeplin et al., J. Bacteriol. 174: 191-199 (1992); Acetobacter xylinum: GenBank L24077, Brautaset et al., Microbiology 140 (Pt 5), 1183-1188 (1994); Neisseria meningitidis: GenBank U02490, Zhou et al., J. Biol. Chem. 269 (15), 11162-11169 (1994)).

[0079] UDP-glucose pyrophosphorylase (EC 2.7.7.9) catalyzes the next step in the pathway, conversion of Glc-1-P to UDP-Glc. Genes encoding UDP-Glc pyrophosphorylase are described for many organisms (e.g., E. coli: GenBank M98830, Weissborn et al., J. Bacteriol. 176: 2611-2618 (1994); Cricetulus griseus: GenBank AF004368, Flores-Diaz et al., J. Biol. Chem. 272: 23784-23791 (1997); Acetobacter xylinum: GenBank M76548, Brede et al., J. Bacteriol. 173, 7042-7045 (1991); Pseudomonas aeruginosa (galU): GenBank AJ010734, U03751; Streptococcus pneumoniae: GenBank AJ004869; Bacillus subtilis: GenBank Z22516, L12272, Soldo et al., J. Gen. Microbiol. 139 (Pt 12), 3185-3195 (1993); Solanum tuberosum: GenBank U20345, L77092, L77094, L77095, L77096, L77098, U59182, Katsube et al., J. Biochem. 108: 321-326 (1990); Hordeum vulgare (barley): GenBank X91347; Shigella flexneri: GenBank L32811, Sandlin et al., Infect. Immun. 63: 229-237 (1995); human: GenBank U27460, Duggleby et al., Eur. J. Biochem. 235 (1-2), 173-179 (1996); bovine: GenBank L14019, Konishi et al., J. Biochem. 114, 61-68 (1993)).

[0080] Finally, UDP-Glc 4′-epimerase (UDP-Gal 4′ epimerase; EC 5.1.3.2) catalyzes the conversion of UDP-Glc to UDP-Gal. The Streptococcus thermophilus UDP galactose 4-epimerase gene described by Poolman et al. (J. Bacteriol 172: 4037-4047 (1990)) is a particular example of a gene that is useful in the present invention. Exemplary genes encoding UDP glucose 4-epimerase include those of E. coli, K. pneumoniae, S. lividans, and E. stewartii, as well as Salmonella and Streptococcus species. Nucleotide sequences are known for UDP-Glc 4′-epimerases from several organisms, including Pasteurella haemolytica, GenBank U39043, Potter et al., Infect. Immun. 64 (3), 855-860 (1996); Yersinia enterocolitica, GenBank Z47767, X63827, Skurnik et al., Mol. Microbiol. 17: 575-594 (1995); Cyamopsis tetragonoloba: GenBank AJ005082; Pachysolen tannophilus: GenBank X68593, Skrzypek et al., Gene 140 (1), 127-129 (1994); Azospirillum brasilense: GenBank Z25478, De Troch et al., Gene 144 (1), 143-144 (1994); Arabidopsis thaliana: GenBank Z54214, Dormann et al., Arch. Biochem. Biophys. 327: 27-34 (1996); Bacillus subtilis: GenBank X99339, Schrogel et al., FEMS Microbiol. Lett. 145: 341-348 (1996); Rhizobium meliloti: GenBank X58126 S81948, Buendia et al., Mol. Biol. 5: 1519-1530 (1991); Rhizobium leguminosarum: GenBank X96507; Erwinia amylovora: GenBank X76172, Metzger et al., J. Bacteriol. 176: 450-459 (1994); S. cerevisiae: GenBank X81324 (cluster of epimerase and UDP-glucose pyrophosphorylase), Schaaff-Gerstenschlager, Yeast 11: 79-83 (1995); Neisseria meningitidis: GenBank U19895, L20495, Lee et al., Infect. Immun. 63: 2508-2515 (1995), Jennings et al., Mol. Microbiol. 10: 361-369 (1993); and Pisum sativum: GenBank U31544.
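
The four-step route from glucose to UDP-Gal described in the preceding paragraphs can be summarized as an ordered list of enzyme, substrate, and product. The following is a minimal illustrative sketch only; the names `UDP_GAL_PATHWAY`, `run_pathway`, and `domains_needed` are hypothetical helpers introduced here and are not part of the described invention. It checks that each step consumes the product of the previous one and lists which accessory catalytic domains would be needed when starting from a given intermediate.

```python
# Each step: (enzyme named in the text, substrate, product).
UDP_GAL_PATHWAY = [
    ("glucokinase", "Glc", "Glc-6-P"),
    ("phosphoglucomutase", "Glc-6-P", "Glc-1-P"),
    ("UDP-glucose pyrophosphorylase", "Glc-1-P", "UDP-Glc"),
    ("UDP-Glc 4'-epimerase", "UDP-Glc", "UDP-Gal"),
]


def run_pathway(steps, start):
    """Follow the steps in order and return every intermediate, start included."""
    intermediates = [start]
    for enzyme, substrate, product in steps:
        if substrate != intermediates[-1]:
            raise ValueError(
                f"{enzyme} expects {substrate}, but the pathway is at {intermediates[-1]}"
            )
        intermediates.append(product)
    return intermediates


def domains_needed(steps, available):
    """List the enzymes required to reach the final product from `available`."""
    for index, (_, substrate, _) in enumerate(steps):
        if substrate == available:
            return [enzyme for enzyme, _, _ in steps[index:]]
    raise ValueError(f"{available} is not a substrate in this pathway")


if __name__ == "__main__":
    print(" -> ".join(run_pathway(UDP_GAL_PATHWAY, "Glc")))
    print(domains_needed(UDP_GAL_PATHWAY, "Glc-1-P"))
```

Starting from a later intermediate shortens the list of accessory domains, which mirrors the option of supplying an intermediate to the reaction mixture rather than producing it in the fusion polypeptide.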

[0081] Often, genes encoding enzymes that make up a pathway involved in synthesizing nucleotide sugars are found in a single operon or region of chromosomal DNA. For example, the Xanthomonas campestris phosphoglucomutase, phosphomannomutase (xanA), phosphomannose isomerase, and GDP-mannose pyrophosphorylase (xanB) genes are found on a single contiguous nucleic acid fragment (Koeplin et al., J. Bacteriol. 174, 191-199 (1992)). Klebsiella pneumoniae galactokinase, galactose-1-phosphate uridyltransferase, and UDP-galactose 4′-epimerase are also found in a single operon (Peng et al. (1992) J. Biochem. 112: 604-608). Many other examples are described in the references cited herein.

[0082] An alternative galactosyltransferase fusion polypeptide can include a catalytic domain from UDP-Gal pyrophosphorylase (galactose-1-phosphate uridyltransferase), which converts Gal-1-P to UDP-Gal. Genes that encode UDP-Gal pyrophosphorylase have been characterized for several organisms, including, for example, Rattus norvegicus: GenBank L05541, Heidenreich et al., DNA Seq. 3: 311-318 (1993); Lactobacillus casei: GenBank AF005933 (cluster of galactokinase (galK), UDP-galactose 4-epimerase (galE), galactose 1-phosphate-uridyltransferase (galT)), Bettenbrock et al., Appl. Environ. Microbiol. 64: 2013-2019 (1998); E. coli: GenBank X06226 (galE and galT for UDP-galactose-4-epimerase and galactose-1-P uridyltransferase), Lemaire et al., Nucleic Acids Res. 14: 7705-7711 (1986); B. subtilis: GenBank Z99123 AL009126; Neisseria gonorrhoeae: GenBank Z50023, Ulkich et al., J. Bacteriol. 177: 6902-6909 (1995); Haemophilus influenzae: GenBank X65934 (cluster of galactose-1-phosphate uridyltransferase, galactokinase, mutarotase and galactose repressor), Maskell et al., Mol. Microbiol. 6: 3051-3063 (1992), GenBank M12348 and M12999, Tajima et al., Yeast 1: 67-77 (1985); S. cerevisiae: GenBank X81324, Schaaff-Gerstenschlager et al., Yeast 11: 79-83 (1995); Mus musculus: GenBank U41282; human: GenBank M96264, M18731, Leslie et al., Genomics 14: 474-480 (1992), Reichardt et al., Mol. Biol. Med. 5: 107-122 (1988); Streptomyces lividans: M18953 (galactose 1-phosphate uridyltransferase, UDP-galactose 4-epimerase, and galactokinase), Adams et al., J. Bacteriol. 170: 203-212 (1988).

[0083] Catalytic domains of UDP-GlcNAc 4′ epimerase (UDP-GalNAc 4′-epimerase) (EC 5.1.3.7), which catalyzes the conversion of UDP-GlcNAc to UDP-GalNAc, and the reverse reaction, are also suitable for use in the fusion polypeptides of the invention. Several loci that encode this enzyme are described above. See also, U.S. Pat. No. 5,516,665.

[0084] Another example of a fusion polypeptide provided by the invention is used for producing a fucosylated soluble oligosaccharide. The donor nucleotide sugar for fucosyltransferases is GDP-fucose, which is relatively expensive to produce. To reduce the cost of producing the fucosylated oligosaccharide, the invention provides fusion polypeptides that can convert the relatively inexpensive GDP-mannose into GDP-fucose, and then catalyze the transfer of the fucose to an acceptor saccharide. These fusion polypeptides include a catalytic domain from at least one of a GDP-mannose dehydratase, a GDP-4-keto-6-deoxy-D-mannose 3,5-epimerase, or a GDP-4-keto-6-deoxy-L-glucose 4-reductase. When each of these enzyme activities is provided, one can convert GDP-mannose into GDP-fucose.

[0085] The nucleotide sequence of an E. coli gene cluster that encodes GDP-fucose-synthesizing enzymes is described by Stevenson et al. ((1996) J. Bacteriol. 178: 4885-4893; GenBank Accession No. U38473). This gene cluster had been reported to include an open reading frame for GDP-mannose dehydratase (nucleotides 8633-9754; Stevenson et al., supra.). It was recently discovered that this gene cluster also contains an open reading frame that encodes an enzyme that has both 3,5 epimerization and 4-reductase activities (see, commonly assigned U.S. Provisional Patent Application No. 60/071,076, filed Jan. 15, 1998), and thus is capable of converting the product of the GDP-mannose dehydratase reaction (GDP-4-keto-6-deoxymannose) to GDP-fucose. This ORF, which is designated YEF B, is found between nucleotides 9757-10722. Prior to this discovery that YEF B encodes an enzyme having two activities, it was not known whether one or two enzymes were required for conversion of GDP-4-keto-6-deoxymannose to GDP-fucose. The nucleotide sequence of a gene encoding the human Fx enzyme is found in GenBank Accession No. U58766.
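
As a quick arithmetic check on the coordinates given above, the lengths of the two open reading frames can be computed directly. This is a sketch under the assumption that the nucleotide positions are 1-based and inclusive at both ends, as is usual for GenBank feature locations; the function name `orf_length` and the dictionary `ORFS` are hypothetical and introduced only for illustration. Whether a stop codon is counted within each range is not stated in the text.

```python
def orf_length(start, end):
    """Length in nucleotides of a feature with 1-based, inclusive coordinates."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


# Coordinates quoted in the text for the E. coli gene cluster (GenBank U38473).
ORFS = {
    "GDP-mannose dehydratase": (8633, 9754),
    "YEF B": (9757, 10722),
}

if __name__ == "__main__":
    for name, (start, end) in ORFS.items():
        length = orf_length(start, end)
        codons, remainder = divmod(length, 3)
        print(f"{name}: {length} nt, {codons} codons, remainder {remainder}")

    # Nucleotides lying strictly between the two reading frames.
    gap = ORFS["YEF B"][0] - ORFS["GDP-mannose dehydratase"][1] - 1
    print(f"intergenic gap: {gap} nt")
```

Under the stated assumption, both ranges are exact multiples of three, which is consistent with their description as open reading frames, and the two frames are separated by only two nucleotides.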

[0086] Also provided are fusion polypeptides that include a mannosyltransferase catalytic domain and a catalytic domain of a GDP-Man pyrophosphorylase (EC 2.7.7.22), which converts Man-1-P to GDP-Man. Suitable genes are known from many organisms, including E. coli: GenBank U13629, AB010294, D43637 D13231, Bastin et al., Gene 164: 17-23 (1995), Sugiyama et al., J. Bacteriol. 180: 2775-2778 (1998), Sugiyama et al., Microbiology 140 (Pt 1): 59-71 (1994), Kido et al., J. Bacteriol. 177: 2178-2187 (1995); Klebsiella pneumoniae: GenBank AB010296, AB010295, Sugiyama et al., J. Bacteriol. 180: 2775-2778 (1998); Salmonella enterica: GenBank X56793 M29713, Stevenson et al., J. Bacteriol. 178: 4885-4893 (1996).

[0087] The fusion polypeptides of the invention for fucosylating a saccharide acceptor can also utilize enzymes that provide a minor or “scavenge” pathway for GDP-fucose formation. In this pathway, free fucose is phosphorylated by fucokinase to form fucose 1-phosphate, which, along with guanosine 5′-triphosphate (GTP), is used by GDP-fucose pyrophosphorylase to form GDP-fucose (Ginsburg et al., J. Biol. Chem., 236: 2389-2393 (1961) and Reitman, J. Biol. Chem., 255: 9900-9906 (1980)). Accordingly, a fucosyltransferase catalytic domain can be linked to a catalytic domain from a GDP-fucose pyrophosphorylase, for which suitable nucleic acids are described in copending, commonly assigned U.S. patent application Ser. No. 08/826,964, filed Apr. 9, 1997. Fucokinase-encoding nucleic acids are described for, e.g., Haemophilus influenzae (Fleischmann et al. (1995) Science 269:496-512) and E. coli (Lu and Lin (1989) Nucleic Acids Res. 17: 4883-4884).

[0088] Other pyrophosphorylases are known that convert a sugar phosphate into a nucleotide sugar. For example, UDP-GalNAc pyrophosphorylase catalyzes the conversion of GalNAc-1-P to UDP-GalNAc. UDP-GlcNAc pyrophosphorylase (EC 2.7.7.23) converts GlcNAc-1-P to UDP-GlcNAc (B. subtilis: GenBank Z99104 AL009126, Kunst et al., supra.; Candida albicans: GenBank AB011003, Mio et al., J. Biol. Chem. 273 (23), 14392-14397 (1998); Saccharomyces cerevisiae: GenBank AB011272, Mio et al., supra.; human: GenBank AB011004, Mio et al., supra.). These can also be used in the fusion polypeptides of the invention.

[0089] The invention also provides fusion polypeptides that are useful for sialylation reactions. These fusion polypeptides include a catalytic domain from a sialyltransferase and a catalytic domain from a CMP-sialic acid synthetase (EC 2.7.7.43, CMP-N-acetylneuraminic acid synthetase). Such genes are available from, for example, Mus musculus (GenBank AJ006215, Munster et al., Proc. Natl. Acad. Sci. U.S.A. 95: 9140-9145 (1998)), rat (Rodriguez-Aparicio et al. (1992) J. Biol. Chem. 267: 9257-63), Haemophilus ducreyi (Tullius et al. (1996) J. Biol. Chem. 271: 15373-80), Neisseria meningitidis (Ganguli et al. (1994) J. Bacteriol. 176: 4583-9), group B streptococci (Haft et al. (1994) J. Bacteriol. 176: 7372-4), and E. coli (GenBank J05023, Zapata et al. (1989) J. Biol. Chem. 264: 14769-14774). Alternatively, fusion proteins for sialylation reactions can have a catalytic domain from either or both of GlcNAc 2′ epimerase (EC 5.1.3.8), which converts GlcNAc to ManNAc, and neuraminic acid aldolase (EC 4.1.3.3; SwissProt Accession No. P06995), which in turn converts the ManNAc to sialic acid.

[0090] Additional accessory enzymes from which one can obtain a catalytic domain are those that are involved in forming reactants consumed in a glycosyltransferase cycle. For example, any of several phosphate kinases are useful as accessory enzymes. Polyphosphate kinase (EC 2.7.4.1), for example, catalyzes the formation of ATP, and nucleoside phosphate kinases (EC 2.7.4.4) can form the respective nucleoside diphosphates. Other suitable kinases include creatine phosphate kinase (EC 2.7.3.2), myokinase (EC 2.7.4.3), N-acetylglucosamine acetyl kinase (EC 2.7.1.59), acetyl phosphate kinase, and pyruvate kinase (EC 2.7.1.40).

[0091] C. Cloning of Glycosyltransferase and Accessory Enzyme Nucleic Acids

[0092] Nucleic acids that encode glycosyltransferases and accessory enzymes, and methods of obtaining such nucleic acids, are known to those of skill in the art. Suitable nucleic acids (e.g., cDNA, genomic, or subsequences (probes)) can be cloned, or amplified by in vitro methods such as the polymerase chain reaction (PCR), the ligase chain reaction (LCR), the transcription-based amplification system (TAS), and the self-sustained sequence replication system (SSR). A wide variety of cloning and in vitro amplification methodologies are well-known to persons of skill. Examples of these techniques and instructions sufficient to direct persons of skill through many cloning exercises are found in Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology 152 Academic Press, Inc., San Diego, Calif. (Berger); Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2nd ed.) Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor Press, N.Y., (Sambrook et al.); Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (1994 Supplement) (Ausubel); Cashion et al., U.S. Pat. No. 5,017,478; and Carr, European Patent No. 0,246,864.

[0093] DNA that encodes glycosyltransferase and accessory enzyme polypeptides, or subsequences thereof, can be prepared by any suitable method described above, including, for example, cloning and restriction of appropriate sequences. In one preferred embodiment, a nucleic acid encoding a glycosyltransferase or accessory enzyme can be isolated by routine cloning methods. A nucleotide sequence of a glycosyltransferase or accessory enzyme as provided in, for example, GenBank or other sequence database (see above) can be used to provide probes that specifically hybridize to a glycosyltransferase or accessory enzyme gene in a genomic DNA sample, or to a glycosyltransferase or accessory enzyme mRNA in a total RNA sample (e.g., in a Southern or Northern blot). Once the target glycosyltransferase or accessory enzyme nucleic acid is identified, it can be isolated according to standard methods known to those of skill in the art (see, e.g., Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd Ed., Vols. 1-3, Cold Spring Harbor Laboratory; Berger and Kimmel (1987) Methods in Enzymology, Vol. 152: Guide to Molecular Cloning Techniques, San Diego: Academic Press, Inc.; or Ausubel et al. (1987) Current Protocols in Molecular Biology, Greene Publishing and Wiley-Interscience, New York). Alternatively, subsequences can be cloned and the appropriate subsequences cleaved using appropriate restriction enzymes. The fragments may then be ligated to produce the desired DNA sequence.

[0094] A glycosyltransferase nucleic acid can also be cloned by detecting its expressed product by means of assays based on the physical, chemical, or immunological properties. For example, one can identify a cloned glycosyltransferase nucleic acid by the ability of a polypeptide encoded by the nucleic acid to catalyze the transfer of a monosaccharide from a donor to an acceptor moiety. In a preferred method, capillary electrophoresis is employed to detect the reaction products. This highly sensitive assay involves using either monosaccharide or disaccharide aminophenyl derivatives which are labeled with fluorescein as described in Wakarchuk et al. (1996) J. Biol. Chem. 271 (45): 28271-276. For example, to assay for a Neisseria lgtC enzyme, either FCHASE-AP-Lac or FCHASE-AP-Gal can be used, whereas for the Neisseria lgtB enzyme an appropriate reagent is FCHASE-AP-GlcNAc (Id.).

[0095] As an alternative to cloning a glycosyltransferase or accessory enzyme gene or cDNA, a glycosyltransferase nucleic acid can be chemically synthesized from a known sequence that encodes a glycosyltransferase. Suitable methods include the phosphotriester method of Narang et al. (1979) Meth. Enzymol. 68: 90-99; the phosphodiester method of Brown et al. (1979) Meth. Enzymol. 68: 109-151; the diethylphosphoramidite method of Beaucage et al. (1981) Tetra. Lett., 22: 1859-1862; and the solid support method of U.S. Pat. No. 4,458,066. Chemical synthesis produces a single stranded oligonucleotide. This can be converted into double stranded DNA by hybridization with a complementary sequence, or by polymerization with a DNA polymerase using the single strand as a template. One of skill would recognize that while chemical synthesis of DNA is often limited to sequences of about 100 bases, longer sequences may be obtained by the ligation of shorter sequences.
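
The idea of building a long sequence from chemically synthesized pieces of about 100 bases can be illustrated with a small sketch. The function `split_into_oligos` is a hypothetical helper, and the plain end-to-end split shown here is a simplification: a practical design would also account for the complementary strand and for overlaps or compatible ends that allow the fragments to anneal and be ligated in the correct order.

```python
def split_into_oligos(sequence, max_len=100):
    """Split a sequence into consecutive fragments no longer than max_len bases."""
    if max_len < 1:
        raise ValueError("max_len must be at least 1")
    return [sequence[i:i + max_len] for i in range(0, len(sequence), max_len)]


if __name__ == "__main__":
    # Hypothetical 250-base target sequence used only to exercise the function.
    target = "ACGT" * 62 + "AC"
    oligos = split_into_oligos(target)
    print([len(oligo) for oligo in oligos])
    # Joining the fragments in order (ligation, conceptually) restores the target.
    print("".join(oligos) == target)
```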

[0096] Glycosyltransferase and accessory enzyme nucleic acids can be cloned using DNA amplification methods such as polymerase chain reaction (PCR). Thus, for example, the nucleic acid sequence or subsequence is PCR amplified, using a sense primer containing one restriction site (e.g., NdeI) and an antisense primer containing another restriction site (e.g., HindIII). This will produce a nucleic acid encoding the desired glycosyltransferase or accessory enzyme sequence or subsequence and having terminal restriction sites. This nucleic acid can then be easily ligated into a vector containing a nucleic acid encoding the second molecule and having the appropriate corresponding restriction sites. Suitable PCR primers can be determined by one of skill in the art using the sequence information provided in GenBank or other sources. Appropriate restriction sites can also be added to the nucleic acid encoding the glycosyltransferase protein or protein subsequence by site-directed mutagenesis. The plasmid containing the glycosyltransferase-encoding nucleotide sequence or subsequence is cleaved with the appropriate restriction endonuclease and then ligated into an appropriate vector for amplification and/or expression according to standard methods. Examples of techniques sufficient to direct persons of skill through in vitro amplification methods are found in Berger, Sambrook, and Ausubel, as well as Mullis et al., (1987) U.S. Pat. No. 4,683,202; PCR Protocols A Guide to Methods and Applications (Innis et al., eds) Academic Press Inc. San Diego, Calif. (1990) (Innis); Arnheim & Levinson (Oct. 1, 1990) C&EN 36-47; The Journal Of NIH Research (1991) 3: 81-94; Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86: 1173; Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87, 1874; Lomell et al. (1989) J. Clin. Chem., 35: 1826; Landegren et al., (1988) Science 241: 1077-1080; Van Brunt (1990) Biotechnology 8: 291-294; Wu and Wallace (1989) Gene 4: 560; and Barringer et al. (1990) Gene 89: 117.
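
The primer arrangement described above, a sense primer carrying an NdeI site and an antisense primer carrying a HindIII site, can be sketched as follows. The recognition sequences CATATG (NdeI) and AAGCTT (HindIII) are the standard ones; everything else is an assumption made for illustration, including the function names, the annealing length, the short 5′ clamp, and the example coding sequence. The sketch does not check melting temperature, internal restriction sites, or reading frame.

```python
NDEI_SITE = "CATATG"     # NdeI recognition sequence
HINDIII_SITE = "AAGCTT"  # HindIII recognition sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(coding_seq, anneal_len=18, clamp="GGG"):
    """Build a sense primer with an NdeI site and an antisense primer with a
    HindIII site. `clamp` is a hypothetical 5' extension added so that each
    site is not at the very end of the primer."""
    seq = coding_seq.upper()
    if len(seq) < anneal_len:
        raise ValueError("coding sequence is shorter than the annealing length")
    sense = clamp + NDEI_SITE + seq[:anneal_len]
    antisense = clamp + HINDIII_SITE + reverse_complement(seq[-anneal_len:])
    return sense, antisense


if __name__ == "__main__":
    # Hypothetical coding sequence, not taken from any GenBank entry.
    example = "ATGAAACCCGGGTTTACGTACGTTAGCTAGCTAA"
    forward, reverse = design_primers(example, anneal_len=10)
    print("sense:    ", forward)
    print("antisense:", reverse)
```

Because NdeI's recognition sequence ends in ATG, it is often positioned so that it overlaps the start codon; the sketch simply prepends the site and leaves that refinement to the user.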

[0097] Other physical properties of a polypeptide expressed from a particular nucleic acid can be compared to properties of known glycosyltransferases or accessory enzymes to provide another method of identifying suitable nucleic acids. Alternatively, a putative glycosyltransferase or accessory enzyme gene can be mutated, and its role as a glycosyltransferase or accessory enzyme established by detecting a variation in the structure of an oligosaccharide normally produced by the glycosyltransferase or accessory enzyme.

[0098] In some embodiments, it may be desirable to modify the glycosyltransferase and/or accessory enzyme nucleic acids. One of skill will recognize many ways of generating alterations in a given nucleic acid construct. Such well-known methods include site-directed mutagenesis, PCR amplification using degenerate oligonucleotides, exposure of cells containing the nucleic acid to mutagenic agents or radiation, chemical synthesis of a desired oligonucleotide (e.g., in conjunction with ligation and/or cloning to generate large nucleic acids) and other well-known techniques. See, e.g., Gillam and Smith (1979) Gene 8:81-97, Roberts et al. (1987) Nature 328: 731-734.

[0099] For example, the glycosyltransferase and/or accessory enzyme nucleic acids can be modified to facilitate the linkage of the two domains to obtain the polynucleotides that encode the fusion polypeptides of the invention. Glycosyltransferase catalytic domains and accessory enzyme catalytic domains that are modified by such methods are also part of the invention. For example, a codon for a cysteine residue can be placed at either end of a domain so that the domain can be linked by, for example, a disulfide linkage. The modification can be done using either recombinant or chemical methods (see, e.g., Pierce Chemical Co. catalog, Rockford Ill.). The glycosyltransferase and/or accessory enzyme catalytic domains are typically joined by linker domains, which are typically polypeptide sequences, such as poly glycine sequences of between about 5 and 200 amino acids, with between about 10 and 100 amino acids being typical. In some embodiments, proline residues are incorporated into the linker to prevent the formation of significant secondary structural elements by the linker. Preferred linkers are often flexible amino acid subsequences which are synthesized as part of a recombinant fusion protein. In one embodiment, the flexible linker is an amino acid subsequence comprising a proline such as Gly(x)-Pro-Gly(x) where x is a number between about 3 and about 100. In other embodiments, a chemical linker is used to connect synthetically or recombinantly produced glycosyltransferase and accessory enzyme catalytic domains. Such flexible linkers are known to persons of skill in the art. For example, poly(ethylene glycol) linkers are available from Shearwater Polymers, Inc. Huntsville, Ala. These linkers optionally have amide linkages, sulfhydryl linkages, or heterofunctional linkages.
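
The Gly(x)-Pro-Gly(x) linker format can be written out explicitly in one-letter amino acid code. The sketch below is illustrative only: `flexible_linker` and `fuse_domains` are hypothetical helper names, and the short domain sequences in the usage example are placeholders rather than real catalytic domains.

```python
def flexible_linker(x):
    """Return a Gly(x)-Pro-Gly(x) linker in one-letter code, e.g. x=3 -> GGGPGGG."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    return "G" * x + "P" + "G" * x


def fuse_domains(first_domain, second_domain, x=5):
    """Join two domain sequences with a Gly(x)-Pro-Gly(x) linker between them."""
    return first_domain + flexible_linker(x) + second_domain


if __name__ == "__main__":
    print(flexible_linker(3))
    # Placeholder sequences standing in for a glycosyltransferase catalytic
    # domain and an accessory enzyme catalytic domain.
    print(fuse_domains("MKTAYIAK", "QRQISFVK", x=4))
```

A linker of this form has 2x + 1 residues, so x = 5 gives an 11-residue linker, which falls within the typical range of about 10 to 100 amino acids stated above.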

[0100] In a preferred embodiment, the recombinant nucleic acids present in the cells of the invention are modified to provide preferred codons which enhance translation of the nucleic acid in a selected organism (e.g., yeast preferred codons are substituted into a coding nucleic acid for expression in yeast).
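The codon substitution described above can be sketched in Python as a synonymous recoding step. This is an illustrative sketch only: the `PREFERRED` table below is a hypothetical placeholder, not measured codon-usage data for yeast or any other organism, and the function names are assumptions.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Hypothetical preferred codons (placeholder values for illustration only).
PREFERRED = {"G": "GGT", "H": "CAC"}


def translate(cds: str) -> str:
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    cds = cds.upper()
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds: str, preferred: dict) -> str:
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    cds = cds.upper()
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        out.append(preferred.get(CODON_TO_AA[codon], codon))
    return "".join(out)


original = "ATGGGACATCATTAA"
recoded = recode(original, PREFERRED)
print(recoded)                                    # ATGGGTCACCACTAA
print(translate(original) == translate(recoded))  # True
```

Because only synonymous codons are exchanged, the encoded polypeptide is unchanged; codons whose amino acid has no entry in the table are left as they are.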

[0101] D. Expression Cassettes and Host Cells for Expressing the Fusion Polypeptides

[0102] Typically, the polynucleotide that encodes the fusion polypeptide is placed under the control of a promoter that is functional in the desired host cell. An extremely wide variety of promoters are well known, and can be used in the expression vectors of the invention, depending on the particular application. Ordinarily, the promoter selected depends upon the cell in which the promoter is to be active. Other expression control sequences such as ribosome binding sites, transcription termination sites and the like are also optionally included. Constructs that include one or more of these control sequences are termed “expression cassettes.” Accordingly, the invention provides expression cassettes into which the nucleic acids that encode fusion polypeptides are incorporated for high level expression in a desired host cell.

[0103] Expression control sequences that are suitable for use in a particular host cell are often obtained by cloning a gene that is expressed in that cell. Commonly used prokaryotic control sequences, which are defined herein to include promoters for transcription initiation, optionally with an operator, along with ribosome binding site sequences, include such commonly used promoters as the beta-lactamase (penicillinase) and lactose (lac) promoter systems (Chang et al., Nature (1977) 198: 1056), the tryptophan (trp) promoter system (Goeddel et al., Nucleic Acids Res. (1980) 8: 4057), the tac promoter (DeBoer et al., Proc. Natl. Acad. Sci. U.S.A. (1983) 80:21-25), and the lambda-derived P_(L) promoter and N-gene ribosome binding site (Shimatake et al., Nature (1981) 292: 128). The particular promoter system is not critical to the invention; any available promoter that functions in prokaryotes can be used.

[0104] For expression of fusion polypeptides in prokaryotic cells other than E. coli, a promoter that functions in the particular prokaryotic species is required. Such promoters can be obtained from genes that have been cloned from the species, or heterologous promoters can be used. For example, the hybrid trp-lac promoter functions in Bacillus in addition to E. coli.

[0105] A ribosome binding site (RBS) is conveniently included in the expression cassettes of the invention. An RBS in E. coli, for example, consists of a nucleotide sequence 3-9 nucleotides in length located 3-11 nucleotides upstream of the initiation codon (Shine and Dalgarno, Nature (1975) 254: 34; Steitz, In Biological regulation and development: Gene expression (ed. R. F. Goldberger), vol. 1, p. 349, 1979, Plenum Publishing, N.Y.).
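The spacing rule above (an RBS located 3-11 nucleotides upstream of the initiation codon) can be checked with a short Python sketch. The use of the Shine-Dalgarno consensus core AGGAGG as the search motif is an assumption of this sketch and is not specified in the text; the function name is likewise hypothetical. The test input is the SYNTM-F1 primer sequence given in Example 1.

```python
def find_rbs_candidates(seq, core="AGGAGG", min_gap=3, max_gap=11):
    """Find (core_start, gap, atg_start) triples in which `core` ends
    min_gap..max_gap nucleotides upstream of an ATG codon.

    `core` is an assumed Shine-Dalgarno-like motif; positions are 0-based.
    """
    seq = seq.upper()
    hits = []
    atg = seq.find("ATG")
    while atg != -1:
        for gap in range(min_gap, max_gap + 1):
            core_start = atg - gap - len(core)
            if core_start >= 0 and seq[core_start:core_start + len(core)] == core:
                hits.append((core_start, gap, atg))
        atg = seq.find("ATG", atg + 1)
    return hits


# SYNTM-F1 primer from Example 1 (5' to 3')
SYNTM_F1 = "CTTAGGAGGTCATATGGAAAAACAAAATATTGCGGTTATAC"
print(find_rbs_candidates(SYNTM_F1))  # [(3, 4, 13)]
```

For this primer the sketch reports an AGGAGG motif starting at position 3 with a 4-nucleotide spacer before the ATG of the NdeI site, which is within the 3-11 nucleotide window.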

[0106] For expression of the fusion polypeptides in yeast, convenient promoters include GAL1-10 (Johnson and Davies (1984) Mol. Cell. Biol. 4:1440-1448), ADH2 (Russell et al. (1983) J. Biol. Chem. 258:2674-2682), PHO5 (EMBO J. (1982) 6:675-680), and MFα (Herskowitz and Oshima (1982) in The Molecular Biology of the Yeast Saccharomyces (eds. Strathern, Jones, and Broach) Cold Spring Harbor Lab., Cold Spring Harbor, N.Y., pp. 181-209). Another suitable promoter for use in yeast is the ADH2/GAPDH hybrid promoter as described in Cousens et al., Gene 61:265-275 (1987). For filamentous fungi such as, for example, strains of the fungi Aspergillus (McKnight et al., U.S. Pat. No. 4,935,349), examples of useful promoters include those derived from Aspergillus nidulans glycolytic genes, such as the ADH3 promoter (McKnight et al., EMBO J. 4: 2093-2099 (1985)) and the tpiA promoter. An example of a suitable terminator is the ADH3 terminator (McKnight et al.).

[0107] Suitable constitutive promoters for use in plants include, for example, the cauliflower mosaic virus (CaMV) 35S transcription initiation region and region VI promoters, the 1′- or 2′-promoter derived from T-DNA of Agrobacterium tumefaciens, and other promoters active in plant cells that are known to those of skill in the art. Other suitable promoters include the full-length transcript promoter from Figwort mosaic virus, actin promoters, histone promoters, tubulin promoters, or the mannopine synthase promoter (MAS). Other constitutive plant promoters include various ubiquitin or polyubiquitin promoters derived from, inter alia, Arabidopsis (Sun and Callis, Plant J., 11(5):1017-1027 (1997)), the mas, Mac or DoubleMac promoters (described in U.S. Pat. No. 5,106,739 and by Comai et al., Plant Mol. Biol. 15:373-381 (1990)) and other transcription initiation regions from various plant genes known to those of skill in the art. Such genes include, for example, ACT11 from Arabidopsis (Huang et al., Plant Mol. Biol. 33:125-139 (1996)), Cat3 from Arabidopsis (GenBank No. U43147, Zhong et al., Mol. Gen. Genet. 251:196-203 (1996)), the gene encoding stearoyl-acyl carrier protein desaturase from Brassica napus (GenBank No. X74782, Slocombe et al., Plant Physiol. 104:1167-1176 (1994)), GPc1 from maize (GenBank No. X15596, Martinez et al., J. Mol. Biol. 208:551-565 (1989)), and Gpc2 from maize (GenBank No. U45855, Manjunath et al., Plant Mol. Biol. 33:97-112 (1997)). Useful promoters for plants also include those obtained from Ti- or Ri-plasmids, from plant cells, plant viruses or other hosts where the promoters are found to be functional in plants. Bacterial promoters that function in plants, and thus are suitable for use in the methods of the invention, include the octopine synthetase promoter, the nopaline synthase promoter, and the mannopine synthase promoter. Suitable endogenous plant promoters include the ribulose-1,5-bisphosphate (RUBP) carboxylase small subunit (ssu) promoter, the α-conglycinin promoter, the phaseolin promoter, the ADH promoter, and heat-shock promoters.

[0108] Either constitutive or regulated promoters can be used in the present invention. Regulated promoters can be advantageous because the host cells can be grown to high densities before expression of the fusion polypeptides is induced. High level expression of heterologous proteins slows cell growth in some situations. An inducible promoter is a promoter that directs expression of a gene where the level of expression is alterable by environmental or developmental factors such as, for example, temperature, pH, anaerobic or aerobic conditions, light, transcription factors and chemicals. Such promoters are referred to herein as “inducible” promoters, which allow one to control the timing of expression of the glycosyltransferase or enzyme involved in nucleotide sugar synthesis. For E. coli and other bacterial host cells, inducible promoters are known to those of skill in the art. These include, for example, the lac promoter, the bacteriophage lambda P_(L) promoter, the hybrid trp-lac promoter (Amann et al. (1983) Gene 25: 167; de Boer et al. (1983) Proc. Nat'l Acad. Sci. USA 80: 21), and the bacteriophage T7 promoter (Studier et al. (1986) J. Mol. Biol.; Tabor et al. (1985) Proc. Nat'l. Acad. Sci. USA 82: 1074-8). These promoters and their use are discussed in Sambrook et al., supra. A particularly preferred inducible promoter for expression in prokaryotes is a dual promoter that includes a tac promoter component linked to a promoter component obtained from a gene or genes that encode enzymes involved in galactose metabolism (e.g., a promoter from a UDP galactose 4-epimerase gene (galE)). The dual tac-gal promoter, which is described in PCT Patent Application Publ. No. WO98/20111, provides a level of expression that is greater than that provided by either promoter alone.

[0109] Inducible promoters for use in plants are known to those of skill in the art (see, e.g., references cited in Kuhlemeier et al. (1987) Ann. Rev. Plant Physiol. 38:221), and include those of the 1,5-ribulose bisphosphate carboxylase small subunit genes of Arabidopsis thaliana (the “ssu” promoter), which are light-inducible and active only in photosynthetic tissue, anther-specific promoters (EP 344029), and seed-specific promoters of, for example, Arabidopsis thaliana (Krebbers et al. (1988) Plant Physiol. 87:859).

[0110] Inducible promoters for other organisms are also well known to those of skill in the art. These include, for example, the arabinose promoter, the lacZ promoter, the metallothionein promoter, and the heat shock promoter, as well as many others.

[0111] A construct that includes a polynucleotide of interest operably linked to gene expression control signals that, when placed in an appropriate host cell, drive expression of the polynucleotide is termed an “expression cassette.” Expression cassettes that encode the fusion polypeptides of the invention are often placed in expression vectors for introduction into the host cell. The vectors typically include, in addition to an expression cassette, a nucleic acid sequence that enables the vector to replicate independently in one or more selected host cells. Generally, this sequence is one that enables the vector to replicate independently of the host chromosomal DNA, and includes origins of replication or autonomously replicating sequences. Such sequences are well known for a variety of bacteria. For instance, the origin of replication from the plasmid pBR322 is suitable for most Gram-negative bacteria. Alternatively, the vector can replicate by becoming integrated into the host cell genomic complement and being replicated as the cell undergoes DNA replication. A preferred expression vector for expression of the enzymes in bacterial cells is pTGK, which includes a dual tac-gal promoter and is described in PCT Patent Application Publ. No. WO98/20111.

[0112] The construction of polynucleotide constructs generally requires the use of vectors able to replicate in bacteria. A plethora of kits are commercially available for the purification of plasmids from bacteria. For their proper use, follow the manufacturer's instructions (see, for example, EasyPrep™ and FlexiPrep™, both from Pharmacia Biotech; StrataClean™, from Stratagene; and QIAexpress Expression System, Qiagen). The isolated and purified plasmids can then be further manipulated to produce other plasmids, and used to transfect cells. Cloning in Streptomyces or Bacillus is also possible.

[0113] Selectable markers are often incorporated into the expression vectors used to express the polynucleotides of the invention. These genes can encode a gene product, such as a protein, necessary for the survival or growth of transformed host cells grown in a selective culture medium. Host cells not transformed with the vector containing the selection gene will not survive in the culture medium. Typical selection genes encode proteins that confer resistance to antibiotics or other toxins, such as ampicillin, neomycin, kanamycin, chloramphenicol, or tetracycline. Alternatively, selectable markers may encode proteins that complement auxotrophic deficiencies or supply critical nutrients not available from complex media, e.g., the gene encoding D-alanine racemase for Bacilli. Often, the vector will have one selectable marker that is functional in, e.g., E. coli, or other cells in which the vector is replicated prior to being introduced into the host cell. A number of selectable markers are known to those of skill in the art and are described for instance in Sambrook et al., supra. A preferred selectable marker for use in bacterial cells is a kanamycin resistance marker (Vieira and Messing, Gene 19: 259 (1982)). Use of kanamycin selection is advantageous over, for example, ampicillin selection because ampicillin is quickly degraded by β-lactamase in culture medium, thus removing selective pressure and allowing the culture to become overgrown with cells that do not contain the vector.

[0114] Suitable selectable markers for use in mammalian cells include, for example, the dihydrofolate reductase gene (DHFR), the thymidine kinase gene (TK), or prokaryotic genes conferring drug resistance: gpt (xanthine-guanine phosphoribosyltransferase), which can be selected for with mycophenolic acid; neo (neomycin phosphotransferase), which can be selected for with G418, hygromycin, or puromycin; and DHFR (dihydrofolate reductase), which can be selected for with methotrexate (Mulligan & Berg (1981) Proc. Nat'l. Acad. Sci. USA 78: 2072; Southern & Berg (1982) J. Mol. Appl. Genet. 1: 327).

[0115] Selection markers for plant and/or other eukaryotic cells often confer resistance to a biocide or an antibiotic, such as, for example, kanamycin, G418, bleomycin, hygromycin, or chloramphenicol, or herbicide resistance, such as resistance to chlorsulfuron or Basta. Examples of suitable coding sequences for selectable markers are: the neo gene, which codes for the enzyme neomycin phosphotransferase, which confers resistance to the antibiotic kanamycin (Beck et al. (1982) Gene 19:327); the hyg gene, which codes for the enzyme hygromycin phosphotransferase and confers resistance to the antibiotic hygromycin (Gritz and Davies (1983) Gene 25:179); and the bar gene (EP 242236), which codes for phosphinothricin acetyl transferase, which confers resistance to the herbicidal compounds phosphinothricin and bialaphos.

[0116] Construction of suitable vectors containing one or more of the above listed components employs standard ligation techniques as described in the references cited above. Isolated plasmids or DNA fragments are cleaved, tailored, and re-ligated in the form desired to generate the plasmids required. To confirm correct sequences in plasmids constructed, the plasmids can be analyzed by standard techniques such as by restriction endonuclease digestion, and/or sequencing according to known methods. Molecular cloning techniques to achieve these ends are known in the art. A wide variety of cloning and in vitro amplification methods suitable for the construction of recombinant nucleic acids are well-known to persons of skill. Examples of these techniques and instructions sufficient to direct persons of skill through many cloning exercises are found in Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in Enzymology, Volume 152, Academic Press, Inc., San Diego, Calif. (Berger); and Current Protocols in Molecular Biology, F. M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley & Sons, Inc., (1998 Supplement) (Ausubel).

[0117] A variety of common vectors suitable for use as starting materials for constructing the expression vectors of the invention are well known in the art. For cloning in bacteria, common vectors include pBR322 derived vectors such as pBLUESCRIPT™, and λ-phage derived vectors. In yeast, vectors include Yeast Integrating plasmids (e.g., YIp5) and Yeast Replicating plasmids (the YRp series plasmids) and pGPD-2. Expression in mammalian cells can be achieved using a variety of commonly available plasmids, including pSV2, pBC12BI, and p91023, as well as lytic virus vectors (e.g., vaccinia virus, adenovirus, and baculovirus), episomal virus vectors (e.g., bovine papillomavirus), and retroviral vectors (e.g., murine retroviruses).

[0118] The methods for introducing the expression vectors into a chosen host cell are not particularly critical, and such methods are known to those of skill in the art. For example, the expression vectors can be introduced into prokaryotic cells, including E. coli, by calcium chloride transformation, and into eukaryotic cells by calcium phosphate treatment or electroporation. Other transformation methods are also suitable.

[0119] Translational coupling may be used to enhance expression. The strategy uses a short upstream open reading frame derived from a highly expressed gene native to the translational system, which is placed downstream of the promoter, and a ribosome binding site followed after a few amino acid codons by a termination codon. Just prior to the termination codon is a second ribosome binding site, and following the termination codon is a start codon for the initiation of translation. The system dissolves secondary structure in the RNA, allowing for the efficient initiation of translation. See Squires et al. (1988) J. Biol. Chem. 263: 16297-16302.

[0120] The fusion polypeptides can be expressed intracellularly, or can be secreted from the cell. Intracellular expression often results in high yields. If necessary, the amount of soluble, active fusion polypeptide may be increased by performing refolding procedures (see, e.g., Sambrook et al., supra.; Marston et al., Bio/Technology (1984) 2: 800; Schoner et al., Bio/Technology (1985) 3:151). In embodiments in which the fusion polypeptides are secreted from the cell, either into the periplasm or into the extracellular medium, the DNA sequence is linked to a cleavable signal peptide sequence. The signal sequence directs translocation of the fusion polypeptide through the cell membrane. An example of a suitable vector for use in E. coli that contains a promoter-signal sequence unit is pTA1529, which has the E. coli phoA promoter and signal sequence (see, e.g., Sambrook et al., supra.; Oka et al., Proc. Natl. Acad. Sci. USA (1985) 82: 7212; Talmadge et al., Proc. Natl. Acad. Sci. USA (1980) 77: 3988; Takahara et al., J. Biol. Chem. (1985) 260: 2670).

[0121] The fusion polypeptides of the invention can also be further linked to other bacterial proteins. This approach often results in high yields, because normal prokaryotic control sequences direct transcription and translation. In E. coli, lacZ fusions are often used to express heterologous proteins. Suitable vectors are readily available, such as the pUR, pEX, and pMR100 series (see, e.g., Sambrook et al., supra.). For certain applications, it may be desirable to cleave the non-glycosyltransferase and/or accessory enzyme amino acids from the fusion protein after purification. This can be accomplished by any of several methods known in the art, including cleavage by cyanogen bromide, a protease, or by Factor X_(a) (see, e.g., Sambrook et al., supra.; Itakura et al., Science (1977) 198: 1056; Goeddel et al., Proc. Natl. Acad. Sci. USA (1979) 76: 106; Nagai et al., Nature (1984) 309: 810; Sung et al., Proc. Natl. Acad. Sci. USA (1986) 83: 561). Cleavage sites can be engineered into the gene for the fusion protein at the desired point of cleavage.

[0122] More than one fusion polypeptide may be expressed in a single host cell by placing multiple transcriptional cassettes in a single expression vector, or by utilizing different selectable markers for each of the expression vectors which are employed in the cloning strategy.

[0123] A suitable system for obtaining recombinant proteins from E. coli which maintains the integrity of their N-termini has been described by Miller et al. Biotechnology 7:698-704 (1989). In this system, the gene of interest is produced as a C-terminal fusion to the first 76 residues of the yeast ubiquitin gene containing a peptidase cleavage site. Cleavage at the junction of the two moieties results in production of a protein having an intact authentic N-terminal residue.

[0124] Fusion polypeptides of the invention can be expressed in a variety of host cells, including E. coli, other bacterial hosts, yeast, and various higher eukaryotic cells such as the COS, CHO and HeLa cell lines and myeloma cell lines. The host cells can be mammalian cells, plant cells, or microorganisms, such as, for example, yeast cells, bacterial cells, or fungal cells. Examples of suitable host cells include, for example, Azotobacter sp. (e.g., A. vinelandii), Pseudomonas sp., Rhizobium sp., Erwinia sp., Escherichia sp. (e.g., E. coli), Bacillus, Pseudomonas, Proteus, Salmonella, Serratia, Shigella, Rhizobia, Vitreoscilla, Paracoccus and Klebsiella sp., among many others. The cells can be of any of several genera, including Saccharomyces (e.g., S. cerevisiae), Candida (e.g., C. utilis, C. parapsilosis, C. krusei, C. versatilis, C. lipolytica, C. zeylanoides, C. guilliermondii, C. albicans, and C. humicola), Pichia (e.g., P. farinosa and P. ohmeri), Torulopsis (e.g., T. candida, T. sphaerica, T. xylinus, T. famata, and T. versatilis), Debaryomyces (e.g., D. subglobosus, D. cantarellii, D. globosus, D. hansenii, and D. japonicus), Zygosaccharomyces (e.g., Z. rouxii and Z. bailii), Kluyveromyces (e.g., K. marxianus), Hansenula (e.g., H. anomala and H. jadinii), and Brettanomyces (e.g., B. lambicus and B. anomalus). Examples of useful bacteria include, but are not limited to, Escherichia, Enterobacter, Azotobacter, Erwinia, and Klebsiella.

[0125] The expression vectors of the invention can be transferred into the chosen host cell by well-known methods such as calcium chloride transformation for E. coli and calcium phosphate treatment or electroporation for mammalian cells. Cells transformed by the plasmids can be selected by resistance to antibiotics conferred by genes contained on the plasmids, such as the amp, gpt, neo and hyg genes.

[0126] In preferred embodiments, fusion polypeptides that comprise eukaryotic glycosyltransferase and accessory enzyme catalytic domains are expressed in eukaryotic host cells. Similarly, fusion polypeptides that comprise prokaryotic catalytic domains are preferably expressed in prokaryotic cells. Alternatively, one can express a mammalian fusion polypeptide in a prokaryotic host cell (see, e.g., Fang et al. (1998) J. Am. Chem. Soc. 120: 6635-6638), or vice versa.

[0127] Once expressed, the recombinant fusion polypeptides can be purified according to standard procedures of the art, including ammonium sulfate precipitation, affinity columns, column chromatography, gel electrophoresis and the like (see, generally, R. Scopes, Protein Purification, Springer-Verlag, N.Y. (1982); Deutscher, Methods in Enzymology Vol. 182: Guide to Protein Purification, Academic Press, Inc. N.Y. (1990)). Substantially pure compositions of at least about 90 to 95% homogeneity are preferred, and 98 to 99% or more homogeneity is most preferred. Once purified, partially or to homogeneity as desired, the polypeptides may then be used (e.g., as immunogens for antibody production).

[0128] To facilitate purification of the fusion polypeptides of the invention, the nucleic acids that encode the fusion polypeptides can also include a coding sequence for an epitope or “tag” for which an affinity binding reagent is available. Examples of suitable epitopes include the myc and V-5 reporter genes; expression vectors useful for recombinant production of fusion polypeptides having these epitopes are commercially available (e.g., Invitrogen (Carlsbad Calif.) vectors pcDNA3.1/Myc-His and pcDNA3.1/V5-His are suitable for expression in mammalian cells). Additional expression vectors suitable for attaching a tag to the fusion proteins of the invention, and corresponding detection systems, are known to those of skill in the art, and several are commercially available (e.g., FLAG™ (Kodak, Rochester N.Y.)). Another example of a suitable tag is a polyhistidine sequence, which is capable of binding to metal chelate affinity ligands. Typically, six adjacent histidines are used, although one can use more or fewer than six. Suitable metal chelate affinity ligands that can serve as the binding moiety for a polyhistidine tag include nitrilo-tri-acetic acid (NTA) (Hochuli, E. (1990) “Purification of recombinant proteins with metal chelating adsorbents” In Genetic Engineering: Principles and Methods, J. K. Setlow, Ed., Plenum Press, N.Y.; commercially available from Qiagen (Santa Clarita, Calif.)).

[0129] Other haptens that are suitable for use as tags are known to those of skill in the art and are described, for example, in the Handbook of Fluorescent Probes and Research Chemicals (6th Ed., Molecular Probes, Inc., Eugene Oreg.). For example, dinitrophenol (DNP), digoxigenin, barbiturates (see, e.g., U.S. Pat. No. 5,414,085), and several types of fluorophores are useful as haptens, as are derivatives of these compounds. Kits are commercially available for linking haptens and other moieties to proteins and other molecules. For example, where the hapten includes a thiol, a heterobifunctional linker such as SMCC can be used to attach the tag to lysine residues present on the capture reagent.

[0130] One of skill would recognize that modifications can be made to the glycosyltransferase and accessory enzyme catalytic domains without diminishing their biological activity. Some modifications may be made to facilitate the cloning, expression, or incorporation of the catalytic domain into a fusion protein. Such modifications are well known to those of skill in the art and include, for example, the addition of codons at either terminus of the polynucleotide that encodes the catalytic domain to provide, for example, a methionine added at the amino terminus to provide an initiation site, or additional amino acids (e.g., poly His) placed on either terminus to create conveniently located restriction sites or termination codons or purification sequences.

[0131] E. Uses of the Fusion Polypeptides

[0132] The invention provides methods of using fusion polypeptides produced using the methods described herein to prepare desired oligosaccharides (which are composed of two or more saccharides). The glycosyltransferase reactions of the invention take place in a reaction medium comprising at least one glycosyltransferase, an acceptor sugar and typically a soluble divalent metal cation. Substrates for the accessory enzyme catalytic moiety are also present, so that the accessory enzyme can synthesize the donor moiety for the glycosyltransferase. The methods rely on the use of a glycosyltransferase to catalyze the addition of a saccharide to a substrate saccharide. For example, the invention provides methods for adding sialic acid to a galactose residue in an α2,3 linkage, by contacting an acceptor moiety comprising a Gal residue, in a reaction mixture, with an α2,3-sialyltransferase/CMP-NeuAc synthetase fusion polypeptide that has been prepared according to the methods described herein. The reaction mixture also includes sialic acid and CTP, as well as other ingredients necessary for activity of the sialyltransferase and the CMP-NeuAc synthetase.

[0133] A number of methods of using glycosyltransferases to synthesize desired oligosaccharide structures are known. Exemplary methods are described, for instance, in WO 96/32491, Ito et al. (1993) Pure Appl. Chem. 65: 753, and U.S. Pat. Nos. 5,352,670, 5,374,541, and 5,545,553.

[0134] The fusion polypeptides prepared as described herein can be used in combination with additional glycosyltransferases. For example, one can use a combination of a sialyltransferase fusion polypeptide and a galactosyltransferase, which may or may not be part of a fusion polypeptide. In this group of embodiments, the enzymes and substrates can be combined in an initial reaction mixture, or preferably the enzymes and reagents for a second glycosyltransferase reaction can be added to the reaction medium once the first glycosyltransferase reaction has neared completion. By conducting two glycosyltransferase reactions in sequence in a single vessel, overall yields are improved over procedures in which an intermediate species is isolated. Moreover, cleanup and disposal of extra solvents and by-products is reduced.

[0135] The products produced by the above processes can be used without purification. However, it is usually preferred to recover the product. Standard, well known techniques for recovery of glycosylated saccharides such as thin or thick layer chromatography, ion exchange chromatography, or membrane filtration can be used. It is preferred to use membrane filtration, more preferably utilizing a nanofiltration or reverse osmosis membrane as described in commonly assigned U.S. patent application Ser. No. 08/947,775, filed Oct. 9, 1997. For instance, membrane filtration wherein the membranes have a molecular weight cutoff of about 1000 to about 10,000 can be used to remove proteins. Nanofiltration or reverse osmosis can then be used to remove salts. Nanofilter membranes are a class of reverse osmosis membranes which pass monovalent salts but retain polyvalent salts and uncharged solutes larger than about 200 to about 1000 Daltons, depending upon the membrane used. Thus, in a typical application, the oligosaccharides of the invention will be retained by the membrane and contaminating salts will pass through.
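The two-stage membrane recovery described above can be sketched as a simple size-based classification in Python. This is a deliberately simplified model: it sorts solutes by molecular mass only and ignores charge effects (for example, the retention of polyvalent salts). The function name, the specific cutoff values chosen within the stated ranges, and the example masses (including a hypothetical 70 kDa protein) are assumptions for illustration.

```python
def two_stage_filtration(solutes, protein_cutoff_da=10_000, nano_cutoff_da=300):
    """Classify solutes (name -> molecular mass in Da) by size alone.

    Stage 1: a membrane with `protein_cutoff_da` retains proteins.
    Stage 2: a nanofilter with `nano_cutoff_da` retains the product
             and lets small salts pass.
    Both cutoffs are assumed values picked from the ranges in the text.
    """
    removed = sorted(n for n, m in solutes.items() if m > protein_cutoff_da)
    product = sorted(n for n, m in solutes.items()
                     if nano_cutoff_da < m <= protein_cutoff_da)
    permeate = sorted(n for n, m in solutes.items() if m <= nano_cutoff_da)
    return removed, product, permeate


# Approximate masses; the protein mass is a hypothetical placeholder.
mixture = {"protein (hypothetical)": 70_000, "sialyllactose": 634, "NaCl": 58}
removed, product, permeate = two_stage_filtration(mixture)
print(removed)   # ['protein (hypothetical)']
print(product)   # ['sialyllactose']
print(permeate)  # ['NaCl']
```

In this sketch the oligosaccharide is the fraction retained by the nanofilter, while the monovalent salt passes through, matching the typical application described above.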

EXAMPLES

[0136] The following examples are offered to illustrate, but not to limit, the present invention.

Example 1

[0137] Construction of a CMP-Neu5Ac Synthetase/α2,3-Sialyltransferase Fusion Protein

[0138] This Example describes the construction and expression of a polynucleotide that encodes a fusion protein that has both CMP-Neu5Ac synthetase activity and α2,3-sialyltransferase activity. Large-scale enzymatic synthesis of oligosaccharides containing terminal N-acetyl-neuraminic acid residues requires large amounts of the sialyltransferase and the corresponding sugar-nucleotide synthetase for the synthesis of the sugar-nucleotide donor, CMP-Neu5Ac, an unstable compound. Using genes cloned from Neisseria meningitidis, we constructed a fusion protein which has both CMP-Neu5Ac synthetase and α-2,3-sialyltransferase activities. The fusion protein was produced in high yields (over 1,200 units per liter, measured using an α-2,3-sialyltransferase assay) in Escherichia coli and functionally pure enzyme could be obtained using a simple protocol. In small-scale enzymatic syntheses, we showed that the fusion protein could sialylate various oligosaccharide acceptors (branched and linear) with N-acetyl-neuraminic acid as well as N-glycolyl- and N-propionyl-neuraminic acid in high conversion yield. The fusion protein was also used to produce α-2,3-sialyllactose at the 100 g scale using a sugar nucleotide cycle reaction, starting from lactose, sialic acid, phosphoenolpyruvate and catalytic amounts of ATP and CMP.

[0139] Previously we reported the cloning and over-expression in Escherichia coli of both the CMP-Neu5Ac synthetase (Gilbert et al. (1997) Biotechnol. Lett. 19: 417-420) and the α-2,3-sialyltransferase (Gilbert et al. (1996) J. Biol. Chem. 271: 28271-28276; Gilbert et al. (1997) Eur. J. Biochem. 249: 187-194) from Neisseria meningitidis. The two enzymes were used together to synthesize milligram amounts of sialyllactose, sialyl-N-acetyllactosamine and sialyl-P^(k) (Neu5Ac-α-(2→3)-Gal-α-(1→4)-Gal-β-(1→4)-Glc). The CMP-Neu5Ac synthetase can also be used to produce CMP derivatives of sialic acid analogs in order to synthesize the corresponding sialo-oligosaccharide analogs (Id.).

[0140] Although we obtained a high yield (750 U/L) of the α-2,3-sialyltransferase in E. coli (Id.), the purified enzyme was relatively insoluble and had a tendency to precipitate and lose activity during storage. Since the CMP-Neu5Ac synthetase was necessary for synthesis purposes and was a soluble enzyme, we decided to make a fused form of these two enzymes to see if it would be more soluble than the individual α-2,3-sialyltransferase. The following two reactions would therefore be catalyzed by the same polypeptide:
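Based on the known activities of a CMP-Neu5Ac synthetase and an α-2,3-sialyltransferase, the two reactions can be sketched as follows. The acceptor notation Gal-OR (a galactose-terminated acceptor with aglycone or oligosaccharide R) and the by-product labels are assumptions added here for illustration.

```latex
\begin{align*}
\text{CTP} + \text{Neu5Ac}
  &\longrightarrow \text{CMP-Neu5Ac} + \text{PP}_i
  && \text{(CMP-Neu5Ac synthetase)} \\
\text{CMP-Neu5Ac} + \text{Gal-OR}
  &\longrightarrow \text{Neu5Ac-}\alpha\text{-}(2{\rightarrow}3)\text{-Gal-OR} + \text{CMP}
  && \text{($\alpha$-2,3-sialyltransferase)}
\end{align*}
```

The product of the first reaction, CMP-Neu5Ac, is the donor substrate of the second, which is why both activities on one polypeptide are of interest.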

[0141] The fused form of these enzymes would also be kinetically favorable since the CMP-Neu5Ac synthetase has a turnover number (Gilbert et al. (1997) Biotechnol. Lett. 19: 417-420) of 31.4 sec⁻¹ while the α-2,3-sialyltransferase has turnover numbers ranging from 0.1 to 1.4 sec⁻¹, depending on the acceptor (Gilbert et al. (1997) Eur. J. Biochem. 249: 187-194 and unpublished data). The fused form would have the additional benefit of reducing enzyme production costs by having a single culture to grow and a single product to purify to obtain the two activities.
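The kinetic argument above can be made concrete with a short calculation of how much faster the synthetase domain turns over than the sialyltransferase domain. The sketch below assumes one copy of each domain per fusion polypeptide and both enzymes working at their quoted turnover numbers; the variable and function names are hypothetical.

```python
K_CAT_SYNTHETASE = 31.4                   # s^-1, CMP-Neu5Ac synthetase (from the text)
K_CAT_SIALYLTRANSFERASE = (0.1, 1.4)      # s^-1, range depending on acceptor (from the text)


def donor_supply_excess(k_syn: float, k_st: float) -> float:
    """Ratio of synthetase turnover to sialyltransferase turnover.

    Assumes a 1:1 ratio of the two catalytic domains, as in a fusion protein.
    """
    return k_syn / k_st


slowest, fastest = K_CAT_SIALYLTRANSFERASE
print(round(donor_supply_excess(K_CAT_SYNTHETASE, fastest), 1))  # 22.4
print(round(donor_supply_excess(K_CAT_SYNTHETASE, slowest), 1))  # 314.0
```

Under these assumptions the synthetase domain can form donor roughly 22 to 314 times faster than the sialyltransferase domain consumes it, so donor synthesis would not be expected to limit the coupled reaction.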

[0142] Materials and Methods

[0143] Construction of the fusion CMP-Neu5Ac synthetase/α-2,3-sialyltransferase.

[0144] PCR was performed with Pwo polymerase as described by the manufacturer (Boehringer Mannheim, Laval, Que.). The Neisseria CMP-Neu5Ac synthetase was amplified using SYNTM-F1 as the 5′ primer (41-mer: 5′-CTTAGGAGGTCATATGGAAAAACAAAATATTGCGGTTATAC-3′ (SEQ ID NO: 3); the NdeI site is in italics) and SYNTM-R6 as the 3′ primer (45-mer: 5′-CGACAGAATTCCGCCACCGCTTTCCTTGTGATTAAGAATGTTTTC-3′ (SEQ ID NO: 4); the EcoRI site is in italics) and pNSY-01 (Gilbert et al. (1997) Biotechnol. Lett. 19: 417-420) as the template.

[0145] The Neisseria α-2,3-sialyltransferase was amplified using SIALM-22F as the 5′ primer (37-mer: 5′-GCATGGAATTCTGGGCTTGAAAAAGGCTTGTTTGACC-3′ (SEQ ID NO: 5); the EcoRI site is in italics) and SIALM-23R as the 3′ primer (59-mer: 5′-CCTAGGTCGACTCATTAGTGGTGATGGTGGTGATGGTTCAGGTCTTCTTCGCTGATCAG-3′ (SEQ ID NO: 6); the SalI site is in italics, the 6-His tail is underlined and the c-myc tag is in bold) and using pNST-09 (Gilbert et al. (1996) J. Biol. Chem. 271: 28271-28276) as the template. The plasmid pFUS-01 was constructed by digesting the CMP-Neu5Ac synthetase PCR product with NdeI and EcoRI and the α-2,3-sialyltransferase PCR product with EcoRI and SalI and cloning them in a modified version of pCWori+ (Gilbert et al. (1997) Eur. J. Biochem. 249: 187-194), in which the lacZα gene fragment has been deleted.
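The primer design described above can be checked in silico. The following Python sketch is an illustration only: it takes the four primer sequences as given in the sequence listing (SEQ ID NOS: 3-6), confirms their stated lengths and restriction sites, and confirms that the antisense primer SIALM-23R encodes six histidine codons followed by a stop codon. The helper name `reverse_complement` and the dictionary layout are assumptions made for this example.

```python
# name: (sequence 5'->3', stated length, recognition site expected in the primer)
PRIMERS = {
    "SYNTM-F1": ("CTTAGGAGGTCATATGGAAAAACAAAATATTGCGGTTATAC", 41, "CATATG"),      # NdeI
    "SYNTM-R6": ("CGACAGAATTCCGCCACCGCTTTCCTTGTGATTAAGAATGTTTTC", 45, "GAATTC"),  # EcoRI
    "SIALM-22F": ("GCATGGAATTCTGGGCTTGAAAAAGGCTTGTTTGACC", 37, "GAATTC"),         # EcoRI
    "SIALM-23R": ("CCTAGGTCGACTCATTAGTGGTGATGGTGGTGATGGTTCAGGTCTTCTTCGCTGATCAG",
                  59, "GTCGAC"),                                                  # SalI
}

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

# Every primer has the stated length and exactly one copy of its site.
for name, (seq, stated_length, site) in PRIMERS.items():
    assert len(seq) == stated_length, name
    assert seq.count(site) == 1, name

# SIALM-23R is antisense, so the tag is read on its reverse complement.
coding = reverse_complement(PRIMERS["SIALM-23R"][0])
HIS6 = "CATCACCACCATCACCAC"            # six His codons (CAT/CAC)
his_start = coding.index(HIS6)
stop_codon = coding[his_start + len(HIS6): his_start + len(HIS6) + 3]
print(coding, stop_codon)
```

The same check can be repeated on any redesigned primer before ordering it.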

[0146] Expression in E. coli and Purification of the Fusion Protein

[0147] The initial screening of pFUS-01 versions was done using E. coli BMH71-18 as the host. For the large-scale production of the fusion protein we used E. coli AD202 (CGSC #7297). A 21 L culture of E. coli AD202/pFUS-01/2 was grown in a 28-L New Brunswick Scientific (Edison, N.J.) fermenter (model MF 128S) as described previously (Gilbert et al. (1997) Eur. J. Biochem. 249: 187-194). The cells were resuspended in 50 mM Hepes pH 7 at a ratio of 20 g of wet cell paste for 80 mL of buffer. Cell extracts were prepared using an Avestin C5 Emulsiflex cell disrupter (Avestin, Ottawa, Ont.). Polyethylene glycol (average molecular weight 8,000 Da) and NaCl were added to 4% and 0.2 M, respectively, and the cell extract was stirred 20 min at 4° C. The extract was centrifuged 20 min at 8000 rpm and the pellet was washed twice with 50 mM Hepes pH 7, 0.2 M NaCl, 4% PEG. The pellet was resuspended with 50 mM Tris, pH 7.5, 1 mM EDTA and Triton X-100 (reduced and peroxide-free) was added to 1% v/v. The resuspended pellet was stirred 30 min at 4° C. and then clarified by centrifugation for 1 h at 13,000×g. The supernatant was applied to two 5-mL HiTrap Chelating columns (Pharmacia Biotech, Uppsala, Sweden) charged with Ni²⁺, the maximum load being 25 mg total protein in each run. The columns were developed with a 60-800 mM imidazole gradient in 10 mM Hepes (pH 7) containing 0.5 M NaCl and 0.2% Triton X-100.

[0148] Assays

[0149] Protein concentration was determined using the bicinchoninic acid protein assay kit from Pierce (Rockford, Ill.). For all of the enzymatic assays one unit of activity was defined as the amount of enzyme that generated one μmol of product per minute. The CMP-Neu5Ac synthetase activity was assayed at 37° C. using 3 mM Neu5Ac, 3 mM CTP, 100 mM Tris pH 8.5, 0.2 mM DTT and 10 mM MgCl₂ in a final volume of 50 μL. The reaction was stopped after 10 min by adding EDTA to 20 mM final concentration and the reaction mixture was analyzed by capillary electrophoresis performed with a Beckman Instruments (Fullerton, Calif.) P/ACE 5510 equipped with a P/ACE diode array detector set at 271 nm and using the separation conditions described previously (Gilbert et al. (1997) Biotechnol. Lett. 19: 417-420).

[0150] All acceptors were synthesized as previously described (Gilbert et al. (1997) Eur. J. Biochem. 249: 187-194; Wakarchuk et al. (1996) J. Biol. Chem. 271: 19166-19173) with the exception that FEX (#F-6130, Molecular Probes, Eugene, Oreg.) was used in place of FCHASE for the LacNAc acceptor.

[0151] The α-2,3-sialyltransferase activity was assayed at 37° C. using 0.5 mM LacNAc-FEX, 0.2 mM CMP-Neu5Ac, 50 mM Mes pH 6.0, 10 mM MnCl₂ in a final volume of 10 μL. After 5 min the reactions were terminated by dilution with 10 mM NaOH and analyzed by capillary electrophoresis performed using the separation conditions described previously (Gilbert et al. (1997) Eur. J. Biochem. 249: 187-194).
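The unit definition given above (one unit generates one μmol of product per minute) can be applied to an assay readout as in the following Python sketch. The function name and the 10% conversion value are hypothetical and serve only to illustrate the arithmetic; the acceptor concentration, volume and time are those of the α-2,3-sialyltransferase assay.

```python
def activity_units(fraction_converted, substrate_mM, volume_uL, minutes):
    """Return the enzyme units (umol product per minute) present in one assay."""
    substrate_nmol = substrate_mM * volume_uL          # mM x uL = nmol
    product_umol = fraction_converted * substrate_nmol / 1000.0
    return product_umol / minutes

# Hypothetical readout: 10% of the 0.5 mM LacNAc-FEX converted in a
# 10 uL assay after 5 min.
units = activity_units(0.10, 0.5, 10, 5)
print(f"{units * 1000:.2f} mU in the assay")
```

Dividing the result by the volume or mass of enzyme added to the assay gives the activity in U/mL or U/mg.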

[0152] The coupled assay was performed using similar conditions except that the incubation time was 10 min and the reaction mixture included 0.5 mM LacNAc-FEX, 3 mM CTP, 3 mM Neu5Ac, 100 mM Tris pH 7.5, 0.2 mM DTT and 10 mM MgCl₂. The same reagent concentrations were used when the alternate acceptors (Lac-FCHASE and P^(k)-FCHASE) or the alternate donors (Neu5Gc and Neu5Pr) were tested, except the reaction times were 60 to 120 min.

[0153] Sialylation of a biantennary acceptor was performed using 1 mg of Gal-β-(1→4)-GlcNAc-β-(1→2)-Man-α-(1→6)-[Gal-β-(1→4)-GlcNAc-β-(1→2)-Man-α-(1→3)-]-Man-β-(1→4)-GlcNAc-β-(1→4)-GlcNAc in a 90 min reaction. Reaction progress was monitored by TLC using isopropanol/H₂O/ammonium hydroxide (6:3:1) to develop the plate and the sialylated product was purified by gel filtration on Bio-Gel P-4 (Bio-Rad Lab., Hercules, Calif.). The mass of the isolated compound was determined by mass spectrometry (negative ion mode).

[0154] Use in a 100 g Scale Synthesis

[0155] The reaction was performed in a total volume of 2.2 L and the following reagents were added sequentially: lactose monohydrate (59.4 g, 0.165 mol), phospho-enolpyruvate monopotassium salt (34 g, 0.165 mol), bovine serum albumin (2.2 g), sialic acid (51 g, 0.165 mol), CMP (2.84 g, 8.79 mmol), ATP (0.532 g, 0.879 mmol) and sodium azide (0.44 g). The pH was adjusted to 7.4 with NaOH and MnCl₂ was added to a final concentration of 30 mM. The reaction was allowed to proceed at room temperature after the addition of 13,200 units of myokinase (Boehringer Mannheim), 19,800 units of pyruvate kinase (Boehringer Mannheim) and 820 units (based on α-2,3-sialyltransferase activity) of fusion protein obtained by extraction with Triton X-100 of the PEG/NaCl precipitate. Reaction progress was monitored daily by TLC using isopropanol/H₂O/ammonium hydroxide (7:2:1) to develop the plate and orcinol/sulfuric acid followed by heating to visualize the product. Mn²⁺ was monitored daily by ion chromatography and the reaction mixture was supplemented with 1 M MnCl₂ to maintain a final concentration of 30 mM. Supplementary phosphoenolpyruvate was added after two days (0.165 mol) and four days (0.055 mol).
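The composition of the reaction mixture can be restated as concentrations and molar ratios. The following Python sketch is a back-of-the-envelope illustration that uses only the amounts given above and assumes the volume stays at 2.2 L (supplements and pH adjustments are ignored); the dictionary keys are labels chosen for this example.

```python
VOLUME_L = 2.2   # total reaction volume; assumed constant for this estimate

initial_mol = {
    "lactose": 0.165,
    "sialic acid": 0.165,
    "PEP": 0.165,
    "CMP": 8.79e-3,
    "ATP": 0.879e-3,
}

# Starting concentration of each reagent in mM.
conc_mM = {name: 1000.0 * mol / VOLUME_L for name, mol in initial_mol.items()}

# Catalytic cofactors expressed as mol% of the sialic acid substrate.
cmp_mol_percent = 100.0 * initial_mol["CMP"] / initial_mol["sialic acid"]
atp_mol_percent = 100.0 * initial_mol["ATP"] / initial_mol["sialic acid"]

# Total phosphoenolpyruvate: initial charge plus the two supplements.
pep_total_mol = 0.165 + 0.165 + 0.055
pep_equivalents = pep_total_mol / initial_mol["sialic acid"]

for name, value in conc_mM.items():
    print(f"{name}: {value:.1f} mM")
print(f"CMP {cmp_mol_percent:.1f} mol%, ATP {atp_mol_percent:.2f} mol%, "
      f"PEP {pep_equivalents:.2f} equivalents")
```

The low mol% values for CMP and ATP reflect their catalytic role in the cycle, while phosphoenolpyruvate is consumed in excess of the sialic acid.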

[0156] After a total reaction time of 6 days, the crude α-2,3-sialyllactose solution was filtered through two sheets of Whatman filter paper to remove the precipitate, producing a yellow filtrate. Proteins were then removed by tangential flow ultrafiltration using a 3,MWCO membrane (#P2PLBCC01, Millipore, Bedford, Mass.), providing a clear yellow solution. Triton X-100 was removed from the reaction mixture by filtration through a column containing 500 g of C18 reverse phase resin. The eluate was then further purified using a nanofiltration machine (#19T-SSXYC-PES-316-SP, Osmonics, Minnetonka, Minn.) fitted with a spiral wound membrane (#GE2540C1076) and using two different pH values. The pH of the solution was first adjusted with concentrated HCl to pH=3.0, and the feed solution was recirculated for 10 hours while maintaining the total volume of the feed by continuous addition of deionized water. When the conductivity of the permeate solution reached 22 mS, the pH was adjusted to pH=7.0 with 50% NaOH. Recirculation of this solution while maintaining the feed volume with deionized water was performed for an additional 2 hours. The feed solution was concentrated to 800 mL and was then treated with AG50WX8 (H+) Dowex resin until a pH of 2.0 was reached. After removing the resin by filtration, the pH was adjusted to 7.0 with NaOH and the solution was decolorized by passing through activated charcoal. The solution was finally lyophilized to yield a white powder and the α-2,3-sialyllactose content was determined by ¹H NMR analysis in D₂O using 1,2-isopropylidene-α-D-glucofuranose as the reference standard.

[0157] Results

[0158] Construction of the fusion CMP-Neu5Ac synthetase/α-2,3-sialyltransferase

[0159] The Neisseria CMP-Neu5Ac synthetase was amplified by PCR, using primers that included an NdeI site (5′) and an EcoRI site (3′), while the Neisseria α-2,3-sialyltransferase was amplified using primers that included an EcoRI site (5′) and a SalI site (3′). The two PCR products were cloned together in a modified version of pCWori+ (Gilbert et al. (1997) Eur. J. Biochem. 249: 187-194) that was digested with NdeI and SalI. In the resulting construct (pFUS-01) the start codon of the CMP-Neu5Ac synthetase was downstream of the three sequential IPTG-inducible promoters and the ribosome binding site present in pCWori+. The α-2,3-sialyltransferase was linked to the C-terminus of the CMP-Neu5Ac synthetase through a 4-residue peptide linker (Gly-Gly-Gly-Ile) and the C-terminus of the fusion protein includes a c-Myc epitope tag for immuno-detection and a His₆ tail for purification by immobilized metal affinity chromatography (IMAC). In the process of cloning pFUS-01 we also obtained 2 clones that included additional residues in the linker regions. The linker of pFUS-01/2 (see FIG. 1) is 9 residues long (Gly-Gly-Gly-Ile-Leu-Ser-His-Gly-Ile; SEQ ID NO: 7) while the linker of pFUS-01/4 is 8 residues long (Gly-Gly-Gly-Ile-Leu-Ser-Gly-Ile; SEQ ID NO: 8). Analysis by DNA sequencing of the two versions with additional residues suggested that they were cloning artifacts due to incomplete restriction enzyme digestion of the PCR products.

[0160] Expression in E. coli and Purification of the Fusion Protein.

[0161] E. coli BMH71-18 was transformed with the three versions of pFUS-01 and the level of α-2,3-sialyltransferase activity was compared in small-scale cultures (20 mL). The highest activity was obtained with pFUS-01/2, which gave 40% more activity than pFUS-01/4 and 60% more activity than pFUS-01. The fusion protein encoded by pFUS-01/2 has the longest linker, which might aid the independent folding of the two components. However, the effects of linker composition and length were not further studied and pFUS-01/2 was used for the scale-up in production and kinetics comparison.

[0162] Since we had observed an OmpT-catalyzed degradation when pFUS-01/2 was expressed in E. coli BMH71-18 (data not shown) we used an ompT-deficient host strain (E. coli AD202) for expression. In a 21 L culture of E. coli AD202/pFUS-01/2, we measured a production of 1,200 U per liter using an assay for α-2,3-sialyltransferase activity, 11,500 U per liter using an assay for CMP-Neu5Ac synthetase activity and 300 U per liter using a coupled CMP-Neu5Ac synthetase/α-2,3-sialyltransferase assay. SDS-PAGE analysis indicated that a band with the expected molecular mass (70.2 kDa) of the fusion enzyme was predominant in the extract. The activity was associated with the insoluble fraction of the extract since over 95% of the activity was recovered in the pellet when the extract was centrifuged at 100,000×g for 1 hour. This situation was similar to what we observed with the separate α-2,3-sialyltransferase when it was over-expressed in E. coli (Id.). The α-2,3-sialyltransferase is membrane bound in N. meningitidis (Gilbert et al. (1996) J. Biol. Chem. 271: 28271-28276) and it is not surprising that, when over-expressed separately or as a fusion protein in E. coli, part of it was associated with the membranes and/or cell debris.
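To put the per-liter figures in context, the following Python sketch scales them to the 21 L culture and compares the result with the 820 units of fusion protein charged to the 100 g scale synthesis. It is an illustration only; recovery losses during precipitation and extraction are deliberately ignored, and the dictionary keys are labels chosen for this example.

```python
CULTURE_VOLUME_L = 21

# Activities measured in the culture, in units per liter.
yield_U_per_L = {
    "sialyltransferase assay": 1200,
    "synthetase assay": 11500,
    "coupled assay": 300,
}

# Total units available in the whole culture for each assay.
total_U = {assay: per_liter * CULTURE_VOLUME_L
           for assay, per_liter in yield_U_per_L.items()}

# Units used in the 100 g scale synthesis (sialyltransferase basis).
UNITS_USED_IN_SYNTHESIS = 820
fraction_of_culture = UNITS_USED_IN_SYNTHESIS / total_U["sialyltransferase assay"]

for assay, units in total_U.items():
    print(f"{assay}: {units} U")
print(f"fraction of one culture used: {fraction_of_culture:.1%}")
```

On this basis a single 21 L culture contains many times the activity required for one 100 g scale reaction, before purification losses are taken into account.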

[0163] In order to avoid large-scale ultracentrifugation, we developed a precipitation strategy to recover the activity associated with the insoluble fraction at a lower centrifugation speed (12,000×g). Precipitation with 4% polyethylene glycol (PEG 8000) and 0.2 M NaCl afforded over 95% recovery of activity in the pellet, with a 1.8-fold increase in specific activity between the crude extract (0.32 U/mg) and the PEG/NaCl precipitate (0.58 U/mg). The pellet was washed with buffer containing PEG/NaCl in order to remove traces of soluble (cytosolic) enzymes such as hydrolases that could degrade essential co-factors and substrates used in the enzymatic synthesis of target oligosaccharides. Although the washing steps slightly reduced the enzyme recovery, they were essential to obtain functionally pure fusion protein.

[0164] The PEG/NaCl precipitate was extracted with 1% Triton X-100 in order to solubilize the activity. We recovered 60-70% of the enzyme activity in the soluble fraction, which represented a 40-55% yield when compared with the activity present in the total extract and a 3-fold increase in specific activity (1 U/mg). The material extracted with Triton X-100 from the PEG/NaCl precipitate was stable for at least a month at 4° C. and was used in the synthesis reactions described below.
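The fold-purification figures quoted for the precipitation and extraction steps follow directly from the specific activities. The following Python sketch reproduces that arithmetic; the step labels are chosen for this example.

```python
# Specific activities (U/mg) reported for each stage of the recovery.
specific_activity = {
    "crude extract": 0.32,
    "PEG/NaCl precipitate": 0.58,
    "Triton X-100 extract": 1.0,
}

# Fold purification relative to the crude extract.
crude = specific_activity["crude extract"]
fold_purification = {step: value / crude
                     for step, value in specific_activity.items()}

for step, fold in fold_purification.items():
    print(f"{step}: {fold:.1f}-fold")
```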

[0165] Immobilized metal affinity chromatography (IMAC) was performed on the Triton X-100 extract and the fusion protein appeared in the fractions eluting between 400 and 550 mM imidazole. The purified fusion protein had a specific activity of 1-2 U/mg and the overall purification yield was below 5%. Analysis of the purified protein by SDS-PAGE showed that it was at least 90% pure.

[0166] Comparison of the Fusion Protein With the Individual Enzymes

[0167] This comparison was made difficult by the fact that the enzymes differ widely in their solubility and tendency to aggregate when purified to homogeneity. We observed previously that the CMP-Neu5Ac synthetase was soluble to above 20 mg/mL (Gilbert et al. (1997) Biotechnol. Lett. 19: 417-420) while the α-2,3-sialyltransferase precipitated when attempts were made to concentrate it above 1 mg/mL, even in the presence of detergent (Gilbert et al. (1997) Eur. J. Biochem. 249: 187-194). The IMAC-purified fusion protein was soluble to about 5 mg/mL in the presence of 0.2% Triton X-100. Using the α-2,3-sialyltransferase assay we found specific activities in the range of 1 to 1.5 U/mg for different batches of the purified separate α-2,3-sialyltransferase and 1 to 2 U/mg for different batches of the purified fusion protein. A tendency to aggregate might explain the relatively large variation in specific activity between different batches of IMAC-purified fusion protein.

[0168] Previously we observed that partially purified α-2,3-sialyltransferase could be extracted with Triton X-100 from membrane fractions obtained by ultracentrifugation (Id.). This procedure is similar to the extraction of the fusion protein from the PEG/NaCl precipitate but the extraction from the membranes yielded purer material. Such preparations of both the fusion protein and the separate α-2,3-sialyltransferase were more stable than the IMAC-purified material, but since the enzyme was not homogeneous the protein concentration was estimated by scanning densitometry of SDS-PAGE gels. Using this procedure we observed a specific activity of 2.0 U/mg for the separate α-2,3-sialyltransferase and 2.7 U/mg for the fusion protein. When taking into account the molecular masses of these two proteins, we calculated turnover numbers of 1.4 sec⁻¹ for the separate α-2,3-sialyltransferase and 3.2 sec⁻¹ for the fusion enzyme. Given the different solubility properties of these two proteins, it is difficult to conclude if there is any real catalytic improvement of the α-2,3-sialyltransferase when it is in the fused form or if it is simply more stable under the assay conditions. On the other hand, the CMP-Neu5Ac synthetase turnover number of the fused form was comparable to the turnover number of the separate CMP-Neu5Ac synthetase (39.5 sec⁻¹ and 31.4 sec⁻¹, respectively).
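The conversion from specific activity to turnover number can be sketched as follows. Since 1 U/mg is 1 μmol per minute per mg and 1 kDa corresponds to 1 mg per μmol, the product of the two gives turnovers per minute. The function name is illustrative, and the molecular mass printed for the separate α-2,3-sialyltransferase is back-calculated from the reported values for illustration rather than taken from the text.

```python
def turnover_per_sec(specific_activity_U_per_mg, molecular_mass_kDa):
    """Convert specific activity (umol/min/mg) into a turnover number (1/s)."""
    per_minute = specific_activity_U_per_mg * molecular_mass_kDa
    return per_minute / 60.0

# Fusion protein: 2.7 U/mg and an expected molecular mass of 70.2 kDa.
fusion_kcat = turnover_per_sec(2.7, 70.2)

# Molecular mass implied for the separate sialyltransferase by the
# reported 2.0 U/mg and 1.4 per second (a derived value, not a quoted one).
implied_mass_kDa = 1.4 * 60.0 / 2.0

print(f"fusion: {fusion_kcat:.1f} per second")
print(f"implied mass of separate enzyme: {implied_mass_kDa:.0f} kDa")
```

The first value reproduces the 3.2 sec⁻¹ reported for the fusion enzyme.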

[0169] Small Scale Syntheses with Various Donors and Acceptors

[0170] The ability of the fusion protein to use different donors and acceptors was tested in analytical (5 nmol) coupled reactions performed at pH 7.5, which is intermediate between the optimal pH of the α-2,3-sialyltransferase (pH 6) (Gilbert et al. (1996) J. Biol. Chem. 271: 28271-28276) and the optimal pH of the CMP-Neu5Ac synthetase (pH 8.5) (Warren and Blacklow (1962) J. Biol. Chem. 237: 3527-3534). The fusion protein could sialylate N-acetyllactosamine-FEX and lactose-FCHASE with N-acetyl-neuraminic acid as well as the N-propionyl- and N-glycolyl-analogs in yields of at least 97% in 1 hour (Table 1). Both N-acetyllactosamine-FEX and lactose-FCHASE have a terminal β-Gal, which is the natural acceptor for the Neisseria α-2,3-sialyltransferase (Gilbert et al. (1997) Eur. J. Biochem. 249: 187-194).

TABLE 1
Small-scale syntheses using the fusion CMP-Neu5Ac synthetase/α-2,3-sialyltransferase with various donors and acceptors (% conversion to sialylated product).

                                                               Donor^(a)
Acceptor                                                Neu5Ac   Neu5Pr   Neu5Gc
Gal-β-(1→4)-GlcNAc-β^(b) (60 min reaction)              >99      >99      >99
Gal-β-(1→4)-Glc-β^(c) (60 min reaction)                 >99      97       97
Gal-α-(1→4)-Gal-β-(1→4)-Glc-β^(c) (120 min reaction)    84       84       55
Biantennary N-linked type^(d)                           >99      ND^(e)   ND

^(a)Neu5Ac = N-acetyl-neuraminic acid; Neu5Pr = N-propionyl-neuraminic acid; Neu5Gc = N-glycolyl-neuraminic acid.
^(b)This acceptor was a FEX-aminophenyl-glycoside derivative.
^(c)These acceptors were FCHASE-aminophenyl-glycoside derivatives.
^(e)Not determined.

[0171] When P^(k)-FCHASE (Gal-α-(1→4)-Gal-β-(1→4)-Glc-FCHASE) was used as the acceptor in 2 hour reactions, the sialylation yield was 84% with either N-acetyl- or N-propionyl-neuraminic acid while it was 55% with N-glycolyl-neuraminic acid (Table 1). We had observed previously that P^(k)-FCHASE was a substrate for the α-2,3-sialyltransferase but it was found to have a k_(cat)/K_(m) 4 to 40-fold lower than substrates which have terminal β-Gal (Gilbert et al. (1997) Eur. J. Biochem. 249: 187-194). N-glycolyl-neuraminic acid gave the lowest sialylation yields with the three acceptors tested, which is not surprising since the Neisseria CMP-Neu5Ac synthetase had a K_(m) that was 8-fold higher with N-glycolyl-neuraminic acid than with N-acetyl-neuraminic acid (Gilbert et al. (1997) Biotechnol. Lett. 19: 417-420).

[0172] The fusion protein can also use branched oligosaccharides as acceptors since we observed >99% sialylation of an asialo-galactosylated biantennary N-linked type oligosaccharide using N-acetyl-neuraminic acid as the donor (Table 1). This reaction was done at the 1 mg scale using the underivatized oligosaccharide and the mass of the isolated product (2224.0 Da) was found to agree with the mass of the expected di-sialylated biantennary oligosaccharide (2223.3 Da).

[0173] Use in a 100 g Scale Synthesis

[0174] The material extracted with Triton X-100 from the PEG/NaCl precipitate was used in a 100 g scale synthesis to produce α-2,3-sialyllactose using the sialyltransferase cycle (Ichikawa et al. (1991) J. Am. Chem. Soc. 113: 4698-4700) starting from lactose, sialic acid, phosphoenolpyruvate (PEP), and catalytic amounts of ATP and CMP. After 6 days of reaction, the reaction had reached completion as evidenced by the disappearance of sialic acid by TLC analysis. The product was then purified by a sequence of ultrafiltration, nanofiltration and ion exchange. This process yielded 77 g of a white solid which had an α-2,3-sialyllactose content of 88% and a water content of 7%. Based on the α-2,3-sialyllactose content of the isolated product, the overall yield for the synthesis and isolation was 68%.

[0175] Discussion

[0176] The CMP-Neu5Ac synthetase/α-2,3-sialyltransferase fusion protein was expressed at high level in a cost-effective expression system and showed both enzyme activities at levels comparable to those of the individual enzymes. It was readily recoverable by a simple protocol involving precipitation and detergent extraction, therefore avoiding expensive chromatographic steps. The detergent-extracted fusion protein was functionally pure, i.e. it was free from contaminating enzyme activities that can hydrolyze sugar nucleotides or other components of the cofactor regeneration system.

[0177] To be useful for large scale carbohydrate synthesis the fusion protein should be applicable in a sugar nucleotide cycle. This cycle is designed to use only catalytic amounts of expensive sugar nucleotides and nucleoside phosphates, which are enzymatically regenerated in situ from low-cost precursors. The recycling of the converted co-factors also prevents end-product inhibition. The α-2,3-sialyllactose 100 g scale synthesis went to completion, which is important since stoichiometric conversion of substrates is desirable not only to minimize reagent costs but also because it greatly simplifies the purification of the product from a large scale synthesis. Another interesting feature of the fusion protein is that it can directly use different donor analogs and various acceptors with a terminal galactose residue. Consequently it can be used for the synthesis of both natural carbohydrates and synthetic derivatives with novel properties.

[0178] The CMP-Neu5Ac synthetase/α-2,3-sialyltransferase fusion protein was expressed in high yield in E. coli with the two components being at least as active as the separate enzymes, which indicates that they were folded properly. This example suggests that construction and expression of fusion proteins may be of general utility to produce the enzymes required for large-scale biotechnological processes involving multiple enzymatic steps.

Example 2

[0179] Construction of a UDP-Glucose Epimerase/β-1,4-Galactosyltransferase Fusion Protein

[0180] The use of sugar nucleotide cycling systems (SNC) for oligosaccharide synthesis requires a number of enzymes. The purification of these enzymes is a time-consuming and expensive part of the process. In the first example we produced a fusion protein which combines a transferase with its corresponding sugar-nucleotide synthetase (FUS-01), and have shown the advantages of a simple purification of the two activities. In this example we have produced a fusion of two other proteins used in SNC reactions, the UDP-glucose 4-epimerase (galE) and a β-1,4-galactosyltransferase (lgtB).

[0181] Materials and Methods

[0182] DNA Manipulations

[0183] The S. thermophilus UDP-glucose 4-epimerase (galE) gene was amplified from pTGK/EP1 using primers derived from the nucleotide sequence of galE from Streptococcus thermophilus (GenBank accession M38175). GalE-5p was used as the 5′ primer (58-mer: 5′-GGGACAGGATCCATCGATGCTTAGGAGGTCATATGGCAATTTTAGTATTAGGTGGAGC-3′ (SEQ ID NO: 9); the BamHI site is in bold and italics) (primers used in this Example are shown in FIG. 4) and GalE-3p as the 3′ primer (42-mer: 5′-GGGGGGGCTAGCGCCGCCTCCTCGATCATCGTACCCTTTTGG-3′ (SEQ ID NO: 10); the NheI site is in italics). The plasmid pTGK/EP1, which includes the galE gene, was used as the template (see PCT Patent Application Publ. No. WO98/20111).

[0184] The Neisseria β-1,4-galactosyltransferase was amplified using LgtB-NheI as the 5′ primer (38-mer: 5′-GGGGGGGCTAGCGTGCAAAACCACGTTATCAGCTTAGC-3′ (SEQ ID NO: 11); the NheI site is in italics) and LgtB-SalI as the 3′ primer (45-mer: 5′-GGGGGGGTCGACCTATTATTGGAAAGGCACAATGAACTGTTCGCG-3′ (SEQ ID NO: 12); the SalI site is in italics) and using pCW-lgtB(MC58) (Wakarchuk et al. (1998) Protein Engineering 11: 295-302) as the template. The thermocycler parameters were 94° C. 3 min., and 30 cycles of 55° C. 30 sec., 72° C. 30 sec., 94° C. 30 sec. PCR was performed with Pwo polymerase as described by the manufacturer (Boehringer Mannheim, Laval, Que.). The nucleotide (SEQ ID NO: 13) and deduced amino acid (SEQ ID NO: 14) sequences of the Neisseria β-1,4-galactosyltransferase are shown in FIG. 2.

[0185] The plasmid pFUS-EB was constructed as follows (FIG. 3). The UDP-glucose 4-epimerase PCR product was digested with BamHI and NheI and the β-1,4-galactosyltransferase PCR product was digested with NheI and SalI and then recovered from the reaction mixtures using Prep-a-Gene™ resin according to the manufacturer's instructions (BioRad). The two genes were then combined in a three-fragment ligation under standard conditions with the vector pCWori⁺ (Wakarchuk et al. (1994) Protein Science 3: 467-475) that had been digested with BamHI and SalI. DNA was introduced into E. coli DH12S using electroporation with 1 μl of the ligation reaction. Transformants were screened using colony PCR with primers specific for vector sequences flanking the cloning site. Colonies with inserts of the correct size were then grown in liquid culture and tested for enzyme activity.
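The reading frame across the NheI junction of the galE-lgtB fusion can be checked in silico from the two primers that flank it. The following Python sketch is an illustration under stated assumptions: it uses the GalE-3p and LgtB-NheI sequences given above (SEQ ID NOS: 10 and 11), joins them at the regenerated NheI site as a ligation would, and translates the result with the standard genetic code. The function names and the assumption that the reading frame starts at the first base of the reverse-complemented GalE-3p primer are choices made for this example.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

GALE_3P = "GGGGGGGCTAGCGCCGCCTCCTCGATCATCGTACCCTTTTGG"   # antisense primer
LGTB_NHEI = "GGGGGGGCTAGCGTGCAAAACCACGTTATCAGCTTAGC"     # sense primer
NHEI = "GCTAGC"

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq):
    """Translate complete codons from the first base; trailing bases are dropped."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))

# End of galE (sense strand) up to and including the NheI site.
upstream = reverse_complement(GALE_3P)
upstream = upstream[: upstream.index(NHEI) + len(NHEI)]

# Start of lgtB immediately after the NheI site.
downstream = LGTB_NHEI[LGTB_NHEI.index(NHEI) + len(NHEI):]

junction = upstream + downstream
junction_peptide = translate(junction)
print(junction_peptide)
```

The translated peptide can be compared with the junction region given in the sequence listing (SEQ ID NO: 17) to confirm that the epimerase and the galactosyltransferase are joined in frame.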

[0186] Determination of Enzyme Activity

[0187] Standard reactions for the β-1,4-galactosyltransferase enzyme were performed at 37° C. in 20 μl of 50 mM HEPES-NaOH buffer, pH 7.5, containing 10 mM MnCl₂, 1.0 mM fluorescein-labeled acceptor, 1.0 mM UDP-Gal donor and various amounts of enzyme extract from recombinant E. coli that contains the cloned gene. The preparation of the fluorescein-labeled acceptors was as described in Wakarchuk et al. (1996) J. Biol. Chem. 271 (32): 19166-19173 and Wakarchuk et al. (1998) Protein Engineering 11: 295-302.

[0188] Reactions to assess the epimerase-transferase fusion protein were performed with 1.0 mM UDP-Glucose in place of UDP-Gal. Enzymes were assayed after dilution of extracts in buffer containing 1 mg/ml acetylated bovine serum albumin. For calculation of enzyme activity, the enzyme dilutions were chosen such that for reaction times of 5-15 minutes approximately 10% conversion of the acceptor to product would be achieved. The reactions were terminated either by adding an equal volume of 2% SDS and heating to 75° C. for 3 minutes, or by diluting the reaction with 10 mM NaOH. These samples were then diluted appropriately in water prior to analysis by capillary electrophoresis (Wakarchuk et al. (1996) supra.).
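The dilution criterion described above can be turned into a target amount of enzyme per assay. The following Python sketch computes the rate needed to convert about 10% of the 1.0 mM acceptor in a 20 μl reaction within 5 to 15 minutes; because one unit is one μmol per minute, a rate in nmol per minute equals the activity in milliunits. The function name is illustrative.

```python
def target_rate_nmol_per_min(acceptor_mM, volume_uL, fraction_converted, minutes):
    """Rate (nmol/min, i.e. mU) that gives the chosen conversion in the chosen time."""
    acceptor_nmol = acceptor_mM * volume_uL      # mM x uL = nmol
    return acceptor_nmol * fraction_converted / minutes

# Standard reaction: 1.0 mM acceptor in 20 uL, aiming for about 10% conversion.
upper_mU = target_rate_nmol_per_min(1.0, 20, 0.10, 5)    # 5 min reaction
lower_mU = target_rate_nmol_per_min(1.0, 20, 0.10, 15)   # 15 min reaction

print(f"aim for roughly {lower_mU:.2f} to {upper_mU:.2f} mU per assay")
```

An extract would be diluted until the activity added to each 20 μl reaction falls in this range.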

[0189] Small scale extracts were made as follows. The cells were pelleted in a 1.5 ml microcentrifuge tube for 2 min. at maximum speed, and the medium was discarded. The pellet was frozen and then mixed with 2 volumes of 150 μm glass beads (Sigma), and ground with a glass pestle in the microcentrifuge tube. This mixture was then extracted twice with 50 μl of 50 mM HEPES-NaOH pH 7.5. The supernatant from this was used as the source of material for enzyme assays. Larger scale extractions and the PEG-8000 precipitation were performed as described in Gilbert et al. (1998) Nature Biotechnology 16: 769-772.

[0190] To verify that the product from reactions with the epimerase-transferase fusion using UDP-Glc was Gal-β-1,4-GlcNAc-aminophenyl-FEX (FEX-LacNAc), reaction products were separated by TLC and then eluted in methanol. After drying under vacuum, the samples were dissolved in water and glycosidase assays were performed as described in Wakarchuk et al. (1996), supra. These samples were then analyzed by TLC against standards of the FEX-LacNAc and the degradation product, FEX-GlcNAc (data not shown).

[0191] Results

[0192] The pFUS-EB construct was investigated for its induction kinetics. The fusion protein was inducible, but the enzyme activity accumulated to its highest level in shake flasks without any IPTG being added. Activity of the fusion protein was measured with either UDP-Gal or UDP-Glc as the donor. Assays performed using FEX-GlcNAc as an acceptor show that the amount of transferase activity using UDP-Glc as the donor is similar to the amount of transferase activity using UDP-Gal as the donor. The level of expression is such that between 130 and 200 U are produced from 1 L of shake-flask culture.

[0193] With the CMP-NANA/α-2,3-sialyltransferase fusion protein, we have shown the utility of concentrating the enzyme with PEG-8000/NaCl precipitations (Example 1). We have investigated using PEG-8000/NaCl for recovery of the β-1,4-galactosyltransferase/UDP-glucose 4-epimerase fusion polypeptide from the cell-free extracts. Since it appears to be a very soluble protein, we used 16% PEG-8000, which is a higher level than we had used for the other fusion protein. We did not see any adverse effects on enzyme activity after the PEG-8000 recovery step. It appears that the protein is not inhibited by the PEG precipitation step, and that recovery of active protein is high. It also appears that the activity is lower when it is measured in samples with higher concentrations of enzyme using pre-formed UDP-Gal. This may be because the epimerase converts some of the UDP-Gal back to UDP-Glc, which makes the activity appear lower.

[0194] It is understood that the examples and embodiments described herein are for illustrative purposes only and that various modifications or changes in light thereof will be suggested to persons skilled in the art and are to be included within the spirit and purview of this application and scope of the appended claims. All publications, patents, and patent applications cited herein are hereby incorporated by reference for all purposes.

Number of SEQ ID NOs: 18

SEQ ID NO: 1
Length: 828
Type: DNA
Organism: Neisseria meningitidis
Feature: CDS
Location: (1)..(828)
Other information: beta-1,4-galactosyltransferase (lgtB)
Sequence: 1

atg caa aac cac gtt atc agc tta gct tcc gcc gca gaa cgc agg gcg    48
Met Gln Asn His Val Ile Ser Leu Ala Ser Ala Ala Glu Arg Arg Ala
1               5                   10                  15

cac att gcc gat acc ttc ggc agg cac ggc atc ccg ttt cag ttt ttc    96
His Ile Ala Asp Thr Phe Gly Arg His Gly Ile Pro Phe Gln Phe Phe
            20                  25                  30

gac gca ctg atg ccg tct gaa agg ctg gaa cag gca atg gcg gaa ctc    144
Asp Ala Leu Met Pro Ser Glu Arg Leu Glu Gln Ala Met Ala Glu Leu
        35                  40                  45

gtc ccc ggc ttg tcg gcg cac ccc tat ttg agc gga gtg gaa aaa gcc    192
Val Pro Gly Leu Ser Ala His Pro Tyr Leu Ser Gly Val Glu Lys Ala
    50                  55                  60

tgc ttt atg agc cac gcc gta ttg tgg aag cag gca ttg gac gaa ggt    240
Cys Phe Met Ser His Ala Val Leu Trp Lys Gln Ala Leu Asp Glu Gly
65                  70                  75                  80

ctg ccg tat atc acc gta ttt gag gac gac gtt tta ctc ggc gaa ggt    288
Leu Pro Tyr Ile Thr Val Phe Glu Asp Asp Val Leu Leu Gly Glu Gly
                85                  90                  95

gag gaa aaa ttc ctt gcc gaa gac gct tgg ctg caa gaa cgc ttt gac    336
Glu Glu Lys Phe Leu Ala Glu Asp Ala Trp Leu Gln Glu Arg Phe Asp
            100                 105                 110

ccg gat acc gcc ttt atc gtc cgc ttg gaa acg atg ttt atg cac gtc    384
Pro Asp Thr Ala Phe Ile Val Arg Leu Glu Thr Met Phe Met His Val
        115                 120                 125

ctg acc tcg ccc tcc ggc gtg gcg gat tac tgc ggg cgc gcc ttt ccg    432
Leu Thr Ser Pro Ser Gly Val Ala Asp Tyr Cys Gly Arg Ala Phe Pro
    130                 135                 140

ctg ttg gaa agc gaa cac tgg ggg acg gcg ggc tat atc att tcc cga    480
Leu Leu Glu Ser Glu His Trp Gly Thr Ala Gly Tyr Ile Ile Ser Arg
145                 150                 155                 160

aaa gcg atg cgg ttt ttc ctg gac agg ttt gcc gcc ctg ccg ccc gaa    528
Lys Ala Met Arg Phe Phe Leu Asp Arg Phe Ala Ala Leu Pro Pro Glu
                165                 170                 175

ggg ctg cac ccc gtc gat ctg atg atg ttc agc gat ttt ttc gac agg    576
Gly Leu His Pro Val Asp Leu Met Met Phe Ser Asp Phe Phe Asp Arg
            180                 185                 190

gaa gga atg ccg gtt tgc cag ctc aat ccc gcc ttg tgc gcc caa gag    624
Glu Gly Met Pro Val Cys Gln Leu Asn Pro Ala Leu Cys Ala Gln Glu
        195                 200                 205

ctg cat tat gcc aag ttt cac gac caa aac agc gca ttg ggc agc ctg    672
Leu His Tyr Ala Lys Phe His Asp Gln Asn Ser Ala Leu Gly Ser Leu
    210                 215                 220

atc gaa cac gac cgc ctc ctg aac cgc aaa cag caa agg cgc gat tcc    720
Ile Glu His Asp Arg Leu Leu Asn Arg Lys Gln Gln Arg Arg Asp Ser
225                 230                 235                 240

ccc gcc aac aca ttc aaa cac cgc ctg atc cgc gcc ttg acc aaa atc    768
Pro Ala Asn Thr Phe Lys His Arg Leu Ile Arg Ala Leu Thr Lys Ile
                245                 250                 255

agc agg gaa agg gaa aaa cgc cgg caa agg cgc gaa cag ttc att gtg    816
Ser Arg Glu Arg Glu Lys Arg Arg Gln Arg Arg Glu Gln Phe Ile Val
            260                 265                 270

cct ttc caa taa    828
Pro Phe Gln
        275

SEQ ID NO: 2
Length: 275
Type: PRT
Organism: Neisseria meningitidis
Sequence: 2

Met Gln Asn His Val Ile Ser Leu Ala Ser Ala Ala Glu Arg Arg Ala
1               5                   10                  15

His Ile Ala Asp Thr Phe Gly Arg His Gly Ile Pro Phe Gln Phe Phe
            20                  25                  30

Asp Ala Leu Met Pro Ser Glu Arg Leu Glu Gln Ala Met Ala Glu Leu
        35                  40                  45

Val Pro Gly Leu Ser Ala His Pro Tyr Leu Ser Gly Val Glu Lys Ala
    50                  55                  60

Cys Phe Met Ser His Ala Val Leu Trp Lys Gln Ala Leu Asp Glu Gly
65                  70                  75                  80

Leu Pro Tyr Ile Thr Val Phe Glu Asp Asp Val Leu Leu Gly Glu Gly
                85                  90                  95

Glu Glu Lys Phe Leu Ala Glu Asp Ala Trp Leu Gln Glu Arg Phe Asp
            100                 105                 110

Pro Asp Thr Ala Phe Ile Val Arg Leu Glu Thr Met Phe Met His Val
        115                 120                 125

Leu Thr Ser Pro Ser Gly Val Ala Asp Tyr Cys Gly Arg Ala Phe Pro
    130                 135                 140

Leu Leu Glu Ser Glu His Trp Gly Thr Ala Gly Tyr Ile Ile Ser Arg
145                 150                 155                 160

Lys Ala Met Arg Phe Phe Leu Asp Arg Phe Ala Ala Leu Pro Pro Glu
                165                 170                 175

Gly Leu His Pro Val Asp Leu Met Met Phe Ser Asp Phe Phe Asp Arg
            180                 185                 190

Glu Gly Met Pro Val Cys Gln Leu Asn Pro Ala Leu Cys Ala Gln Glu
        195                 200                 205

Leu His Tyr Ala Lys Phe His Asp Gln Asn Ser Ala Leu Gly Ser Leu
    210                 215                 220

Ile Glu His Asp Arg Leu Leu Asn Arg Lys Gln Gln Arg Arg Asp Ser
225                 230                 235                 240

Pro Ala Asn Thr Phe Lys His Arg Leu Ile Arg Ala Leu Thr Lys Ile
                245                 250                 255

Ser Arg Glu Arg Glu Lys Arg Arg Gln Arg Arg Glu Gln Phe Ile Val
            260                 265                 270

Pro Phe Gln
        275

SEQ ID NO: 3
Length: 41
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence: SYNTM-F1 5′ primer
Sequence: 3

cttaggaggt catatggaaa aacaaaatat tgcggttata c    41

SEQ ID NO: 4
Length: 45
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence: SYNTM-R6 3′ primer
Sequence: 4

cgacagaatt ccgccaccgc tttccttgtg attaagaatg ttttc    45

SEQ ID NO: 5
Length: 37
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence: SIALM-22F 5′ primer
Sequence: 5

gcatggaatt ctgggcttga aaaaggcttg tttgacc    37

SEQ ID NO: 6
Length: 59
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence: SIALM-23R 3′ primer
Sequence: 6

cctaggtcga ctcattagtg gtgatggtgg tgatggttca ggtcttcttc gctgatcag    59

SEQ ID NO: 7
Length: 9
Type: PRT
Organism: Artificial Sequence
Description of Artificial Sequence: linker of pFUS-01/2
Sequence: 7

Gly Gly Gly Ile Leu Ser His Gly Ile
1               5

SEQ ID NO: 8
Length: 8
Type: PRT
Organism: Artificial Sequence
Description of Artificial Sequence: linker of pFUS-01/4
Sequence: 8

Gly Gly Gly Ile Leu Ser Gly Ile
1               5

SEQ ID NO: 9
Length: 58
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence: GalE-5p 5′ primer
Sequence: 9

gggacaggat ccatcgatgc ttaggaggtc atatggcaat tttagtatta ggtggagc    58

SEQ ID NO: 10
Length: 42
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence: GalE-3p 3′ primer
Sequence: 10

gggggggcta gcgccgcctc ctcgatcatc gtaccctttt gg    42

SEQ ID NO: 11
Length: 38
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence: LgtB-NheI 5′ primer
Sequence: 11

gggggggcta gcgtgcaaaa ccacgttatc agcttagc    38

SEQ ID NO: 12
Length: 45
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence: LgtB-SalI 3′ primer
Sequence: 12

gggggggtcg acctattatt ggaaaggcac aatgaactgt tcgcg    45

SEQ ID NO: 13
Length: 10
Type: PRT
Organism: Artificial Sequence
Description of Artificial Sequence: peptide linker
Sequence: 13

Gly Gly Gly Ile Leu Ser His Gly Ile Leu
1               5                   10

SEQ ID NO: 14
Length: 6
Type: PRT
Organism: Artificial Sequence
Description of Artificial Sequence: 6-His tail for purification
Sequence: 14

His His His His His His
1               5

SEQ ID NO: 15
Length: 5
Type: PRT
Organism: Artificial Sequence
Description of Artificial Sequence: peptide linker
Sequence: 15

Gly Gly Ala Ser Val
1               5

SEQ ID NO: 16
Length: 63
Type: DNA
Organism: Artificial Sequence
Description of Artificial Sequence: junction region of the galE-lgtB fusion
Sequence: 16

cca aaa ggg tac gat gat cga gga ggc gga gct agc gtg caa aac cac    48
Pro Lys Gly Tyr Asp Asp Arg Gly Gly Gly Ala Ser Val Gln Asn His
1               5                   10                  15

gtt atc agc tta gct    63
Val Ile Ser Leu Ala
            20

SEQ ID NO: 17
Length: 21
Type: PRT
Organism: Artificial Sequence
Description of Artificial Sequence: junction region of the galE-lgtB fusion
Sequence: 17

Pro Lys Gly Tyr Asp Asp Arg Gly Gly Gly Ala Ser Val Gln Asn His
1               5                   10                  15

Val Ile Ser Leu Ala
            20

SEQ ID NO: 18
Length: 4
Type: PRT
Organism: Artificial Sequence
Description of Artificial Sequence: peptide linker
Sequence: 18

Gly Gly Gly Ile
1
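
As an illustrative consistency check, which is not part of the original sequence listing, the following Python sketch translates the junction region of the galE-lgtB fusion (SEQ ID NO: 16) with the standard genetic code and compares the result with the listed peptide (SEQ ID NO: 17). It also confirms that the junction contains the hexamer gctagc, which is the NheI recognition sequence and appears in the GalE-3p and LgtB-NheI primers. The names `CODON_TABLE`, `translate`, `JUNCTION_DNA`, and `JUNCTION_PEPTIDE` are hypothetical and chosen only for this example.

```python
# Standard genetic code (NCBI translation table 1), with bases ordered t, c, a, g.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence into one-letter amino acid codes."""
    dna = dna.lower().replace(" ", "")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Junction region of the galE-lgtB fusion, copied from SEQ ID NO: 16.
JUNCTION_DNA = (
    "cca aaa ggg tac gat gat cga gga ggc gga gct agc gtg caa aac cac "
    "gtt atc agc tta gct"
).replace(" ", "")

# SEQ ID NO: 17 written in one-letter code (hypothetical helper constant).
JUNCTION_PEPTIDE = "PKGYDDRGGGASVQNHVISLA"

if __name__ == "__main__":
    # The translated junction should match the listed 21-residue peptide.
    assert translate(JUNCTION_DNA) == JUNCTION_PEPTIDE
    # The NheI site (gctagc) encodes the Ala-Ser pair at the fusion junction.
    assert "gctagc" in JUNCTION_DNA
    print(translate(JUNCTION_DNA))
```

The same `translate` helper could be applied to the lgtB coding sequence of SEQ ID NO: 1; in that case the final taa stop codon is returned as `*`.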

What is claimed is:
 1. A nucleic acid which comprises a polynucleotide that encodes a fusion polypeptide, wherein the fusion polypeptide comprises: a) a catalytic domain of a glycosyltransferase; and b) a catalytic domain of an accessory enzyme which catalyzes a step in the formation of a nucleotide sugar which is a saccharide donor for the glycosyltransferase.
 2. The nucleic acid of claim 1, wherein the glycosyltransferase is a eukaryotic glycosyltransferase.
 3. The nucleic acid of claim 1, wherein the accessory enzyme is a eukaryotic accessory enzyme.
 4. The nucleic acid of claim 2, wherein the catalytic domain of the glycosyltransferase substantially lacks one or more of a cytoplasmic domain, a signal-anchor domain, and a stem region of the glycosyltransferase.
 5. The nucleic acid of claim 1, wherein the glycosyltransferase is a prokaryotic glycosyltransferase.
 6. The nucleic acid of claim 1, wherein the accessory enzyme is a prokaryotic accessory enzyme.
 7. The nucleic acid of claim 1, wherein the fusion polypeptide further comprises a catalytic domain of a second accessory enzyme.
 8. The nucleic acid of claim 1, wherein the glycosyltransferase is selected from the group consisting of sialyltransferases, N-acetylglucosaminyltransferases, N-acetylgalactosaminyltransferases, fucosyltransferases, galactosyltransferases, glucosyltransferases, glucuronosyltransferases, xylosyltransferases, and mannosyltransferases.
 9. The nucleic acid of claim 1, wherein the accessory enzyme is selected from the group consisting of: a GDP-mannose dehydratase, a GDP-mannose 3,5-epimerase, and a GDP-mannose 4-reductase; a UDP-glucose 4′ epimerase; a UDP-GalNAc 4′ epimerase; a CMP-sialic acid synthetase; a neuraminic acid aldolase; an N-acetylglucosamine 2′ epimerase; a phosphate kinase selected from the group consisting of a pyruvate kinase, a myokinase, a creatine phosphate kinase, an acetyl phosphate kinase, and a polyphosphate kinase; and a pyrophosphorylase selected from the group consisting of a UDP-Glc pyrophosphorylase, a UDP-Gal pyrophosphorylase, a UDP-GalNAc pyrophosphorylase, a GDP-mannose pyrophosphorylase, a GDP-fucose pyrophosphorylase, and a UDP-GlcNAc pyrophosphorylase.
 10. The nucleic acid of claim 1, wherein the nucleotide sugar is selected from the group consisting of GDP-Man, UDP-Glc, UDP-Gal, UDP-GlcNAc, UDP-GalNAc, CMP-sialic acid, GDP-Fuc, and UDP-xylose.
 11. The nucleic acid of claim 1, wherein the glycosyltransferase is a sialyltransferase and the nucleotide sugar is CMP-sialic acid.
 12. The nucleic acid of claim 11, wherein the accessory enzyme is a CMP-sialic acid synthetase.
 13. The nucleic acid of claim 11, wherein the accessory enzyme is a neuraminic acid aldolase or an N-acetylglucosamine 2′ epimerase.
 14. The nucleic acid of claim 1, wherein the glycosyltransferase is a galactosyltransferase and the nucleotide sugar is UDP-galactose.
 15. The nucleic acid of claim 14, wherein the accessory enzyme is a UDP-glucose 4′ epimerase.
 16. The nucleic acid of claim 1, wherein the glycosyltransferase is a fucosyltransferase and the nucleotide sugar is GDP-fucose.
 17. The nucleic acid of claim 16, wherein the accessory enzyme is selected from the group consisting of a GDP-mannose dehydratase, a GDP-mannose 3,5-epimerase, a GDP-fucose pyrophosphorylase, and a GDP-mannose 4-reductase.
 18. The nucleic acid of claim 1, wherein the glycosyltransferase is an N-acetylgalactosaminyltransferase and the nucleotide sugar is UDP-GalNAc.
 19. The nucleic acid of claim 18, wherein the accessory enzyme is a UDP-GalNAc 4′ epimerase.
 20. The nucleic acid of claim 1, wherein the glycosyltransferase is an N-acetylglucosaminyltransferase and the nucleotide sugar is UDP-GlcNAc.
 21. The nucleic acid of claim 20, wherein the accessory enzyme is a UDP-GalNAc 4′ epimerase.
 22. The nucleic acid of claim 1, wherein the glycosyltransferase is a mannosyltransferase and the nucleotide sugar is GDP-Man.
 23. The nucleic acid of claim 1, wherein the fusion polypeptide further comprises a linker peptide between the glycosyltransferase catalytic domain and the accessory enzyme catalytic domain.
 24. The nucleic acid of claim 1, wherein the nucleic acid further comprises a polynucleotide that encodes a signal sequence which is linked to the fusion polypeptide.
 25. The nucleic acid of claim 1, wherein the nucleic acid further comprises a polynucleotide that encodes a molecular tag which is linked to the fusion polypeptide.
 26. An expression vector which comprises a nucleic acid of claim 1.
 27. A host cell which comprises a nucleic acid of claim 1.
 28. A fusion polypeptide encoded by a nucleic acid of claim 1.
 29. A fusion polypeptide that comprises: a) a catalytic domain of a glycosyltransferase; and b) a catalytic domain of an accessory enzyme which catalyzes a step in the formation of a nucleotide sugar which is a donor for the glycosyltransferase.
 30. The fusion polypeptide of claim 29, wherein the catalytic domain of the glycosyltransferase is joined to the carboxy terminus of the accessory enzyme catalytic domain.
 31. The fusion polypeptide of claim 29, wherein the glycosyltransferase is a galactosyltransferase and the accessory enzyme is a UDP-glucose 4′ epimerase.
 32. The fusion polypeptide of claim 29, wherein the glycosyltransferase is a sialyltransferase and the accessory enzyme is a CMP-sialic acid synthetase.
 33. A method of producing a fusion polypeptide that comprises: a) a catalytic domain of a glycosyltransferase; and b) a catalytic domain of an accessory enzyme which catalyzes a step in the formation of a nucleotide sugar which is a donor for the glycosyltransferase; wherein the method comprises introducing a nucleic acid that encodes the fusion polypeptide into a host cell to produce a transformed host cell; and culturing the transformed host cell under conditions appropriate for expressing the fusion polypeptide.
 34. The method of claim 33, wherein the fusion polypeptide is purified following its expression.
 35. The method of claim 33, wherein the host cell is permeabilized following expression of the fusion polypeptide.